Tensorlake
Tensorlake is the AI Data Cloud that reliably transforms data from unstructured sources into ingestion-ready formats for AI Applications.
The langchain-tensorlake package integrates Tensorlake with LangChain, enabling you to build document processing agents with enhanced parsing features such as signature detection.
Tensorlake feature overview
Tensorlake gives you tools to:
- Extract: Schema-driven structured data extraction to pull out specific fields from documents.
- Parse: Convert documents to markdown to build RAG/Knowledge Graph systems.
- Orchestrate: Build programmable workflows for large-scale ingestion and enrichment of Documents, Text, Audio, Video and more.
Learn more at docs.tensorlake.ai
Installation
pip install -U langchain-tensorlake
Examples
Follow a full tutorial on how to detect signatures in unstructured documents using the langchain-tensorlake tool.
Or check out this Colab notebook for a quick start.
Quick Start
1. Set up your environment
You should configure credentials for Tensorlake and OpenAI by setting the following environment variables:
export TENSORLAKE_API_KEY="your-tensorlake-api-key"
export OPENAI_API_KEY="your-openai-api-key"
Get your Tensorlake API key from the Tensorlake Cloud Console. New users get 100 free credits.
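Before building the agent, it can help to confirm that both variables are actually visible to the Python process. The following is a minimal sketch using only the standard library; the helper names `missing_credentials` and `require_credentials` are hypothetical and not part of langchain-tensorlake.

```python
import os

# The two variables named in the setup step above.
REQUIRED_KEYS = ("TENSORLAKE_API_KEY", "OPENAI_API_KEY")


def missing_credentials(env=None):
    """Return the required variable names that are unset or empty.

    `env` defaults to os.environ; passing a dict makes the check testable.
    """
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key)]


def require_credentials(env=None):
    """Raise a RuntimeError listing any missing variables (hypothetical helper)."""
    missing = missing_credentials(env)
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
```

Calling `require_credentials()` at the top of your script surfaces a missing key early, rather than partway through an agent run.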
2. Import necessary packages
from langchain_tensorlake import document_markdown_tool
from langgraph.prebuilt import create_react_agent
import asyncio
import os
3. Build a Signature Detection Agent
async def main(question):
    # Create the agent with the Tensorlake tool
    agent = create_react_agent(
        model="openai:gpt-4o-mini",
        tools=[document_markdown_tool],
        prompt=(
            """
            I have a document that needs to be parsed. \n\nPlease parse this document and answer the question about it.
            """
        ),
        name="real-estate-agent",
    )
    # Run the agent
    result = await agent.ainvoke({"messages": [{"role": "user", "content": question}]})
    # Print the content of the final message returned by the agent
    print(result["messages"][-1].content)
Note: We highly recommend using an OpenAI model as the agent model so that the agent sets the right parsing parameters.
4. Example Usage
# Define the path to the document to be parsed
path = "path/to/your/document.pdf"
# Define the question to be asked and create the agent
question = f"What contextual information can you extract about the signatures in my document found at {path}?"
if __name__ == "__main__":
    asyncio.run(main(question))
Need help?
Reach out to us on Slack or directly through the package repository on GitHub.