LOTUS Makes LLM-Powered Data Processing Fast and Easy

LOTUS is a query engine for processing structured and unstructured data with LLMs.

Get Started in a Few Lines of Code

LOTUS provides an intuitive Python package and familiar Pandas-like API with LLM-powered semantic operators.

        # papers_df is a DataFrame with an `abstract` column
        (
            papers_df.sem_filter("the {abstract} has an open source repo")
            .sem_topk("the {abstract} has the most ground-breaking ideas", K=20)
            .sem_agg("summarize the papers based on their {abstract}")
        )
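
To make the `{abstract}` placeholder concrete, here is a minimal, standard-library-only sketch of what a semantic filter does conceptually: fill the template with each row's column value, then ask a model a yes/no question. This is not LOTUS's implementation. The names `fill_template`, `toy_sem_filter`, and `keyword_model` are hypothetical, and the "model" is just a keyword check standing in for an LLM call.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]


def fill_template(template: str, row: Row) -> str:
    """Substitute {column} placeholders with the values from one row."""
    return template.format(**row)


def toy_sem_filter(rows: List[Row], template: str,
                   model: Callable[[str], bool]) -> List[Row]:
    """Keep the rows for which the model answers True to the filled-in prompt."""
    return [row for row in rows if model(fill_template(template, row))]


def keyword_model(prompt: str) -> bool:
    """Hypothetical stand-in for an LLM: a plain keyword check."""
    return "github.com" in prompt.lower()


papers = [
    {"title": "A", "abstract": "We release code at github.com/example/a."},
    {"title": "B", "abstract": "A purely theoretical analysis."},
]

kept = toy_sem_filter(papers, "the {abstract} has an open source repo", keyword_model)
print([p["title"] for p in kept])  # prints ['A']
```

In a real program the model call would be an LLM request, and the query engine, rather than a list comprehension, would decide how to batch and order those requests.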
                      

The Power of Semantic Operators

LOTUS implements the semantic operator model, a powerful and declarative programming model for AI-based data transformations.

Declarative Programming

Specify your data logic with declarative, high-level operators. Then leave the rest to the query engine!

Highly Optimized Execution

LOTUS automatically optimizes your programs, yielding speedups of up to 400x.

Seamless Integration

Semantic operators seamlessly extend the relational model, making it easy for you to leverage your structured and unstructured data together.
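
As a rough illustration of mixing the two kinds of data, the sketch below applies an ordinary relational predicate on a structured column (`year`) before a semantic predicate on free text (`abstract`). It uses plain Python lists rather than LOTUS. The `stand_in_model` function, the `year` column, and the sample rows are all assumptions made for the example, and the call counter only shows that filtering on structured data first leaves fewer rows for the model to examine.

```python
papers = [
    {"title": "A", "year": 2024, "abstract": "Code is available at github.com/example/a."},
    {"title": "B", "year": 2019, "abstract": "Source released at github.com/example/b."},
    {"title": "C", "year": 2024, "abstract": "A theoretical result with no artifacts."},
]

model_calls = 0


def stand_in_model(prompt: str) -> bool:
    """Hypothetical stand-in for an LLM call; counts how often it is invoked."""
    global model_calls
    model_calls += 1
    return "github.com" in prompt


# Structured predicate: a plain comparison on a typed column.
recent = [p for p in papers if p["year"] >= 2023]

# Semantic predicate: a natural-language question over unstructured text.
template = "the {abstract} has an open source repo"
with_repo = [p for p in recent if stand_in_model(template.format(**p))]

print([p["title"] for p in with_repo])  # prints ['A']
print(model_calls)                      # prints 2
```

Only two of the three rows reach the stand-in model, because the 2019 paper is removed by the structured filter first.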

Use Cases

LOTUS serves a diverse array of applications that need to process data with AI. Here are some examples, each written as a short, intuitive LOTUS program.

Fact-Checking

LOTUS programs reproduce and improve upon the accuracy of state-of-the-art fact-checking pipelines on the FEVER dataset, while optimizing execution to achieve 28x speedups.

Medical Classification

LOTUS achieves state-of-the-art accuracy with a single semantic operator on the BioDEX dataset, which presents a complex medical classification task. Under the hood, the LOTUS query engine automatically explores feasible execution plans to achieve performance 400x faster than the default.

Search and Ranking

LOTUS programs achieve 200% higher accuracy than state-of-the-art retrieval and re-ranking methods, while also running efficiently, with up to 10x lower execution time than the LM-based methods used in prior work.

Research Insights

Simple LOTUS programs process large sets of recent arXiv papers, allowing you to produce summaries, group the data by topic, and answer complex research questions.

Team

The LOTUS project is ongoing work from researchers at Stanford University and UC Berkeley.

Liana Patel

Stanford University

Sid Jha

UC Berkeley

Parth Asawa

UC Berkeley

Melissa Pan

UC Berkeley

Harshit Gupta

Stanford University

Onkar Deshpande

Stanford University

Carlos Guestrin

Stanford University

Matei Zaharia

UC Berkeley