Browserbase
Browserbase is a developer platform to reliably run, manage, and monitor headless browsers.
Power your AI data retrievals with:
- Serverless Infrastructure providing reliable browsers to extract data from complex UIs
- Stealth Mode with included fingerprinting tactics and automatic captcha solving
- Session Debugger to inspect your Browser Session with a network timeline and logs
- Live Debug to quickly debug your automation
Installation and Setup
- Get an API key and Project ID from browserbase.com and set them as environment variables (BROWSERBASE_API_KEY, BROWSERBASE_PROJECT_ID).
- Install the Browserbase SDK:
%pip install browserbase
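Before creating a loader, it can help to confirm that both environment variables from the setup step are actually set. The following is a minimal sketch using only the standard library; the helper name missing_browserbase_vars is hypothetical and not part of the Browserbase SDK or LangChain.

```python
import os

# Variable names taken from the setup step above.
REQUIRED_VARS = ("BROWSERBASE_API_KEY", "BROWSERBASE_PROJECT_ID")


def missing_browserbase_vars(env=None):
    """Return the required variable names that are unset or empty.

    Hypothetical helper for illustration; `env` defaults to os.environ
    and can be replaced with a plain dict for testing.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_browserbase_vars()
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("Browserbase credentials found.")
```

Running this check first gives a clear message when a variable is missing, rather than an error later when the loader tries to connect.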
Loading documents
You can load webpages into LangChain using BrowserbaseLoader. Optionally, set the text_content parameter to True to convert the pages to a text-only representation.
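import os

from dotenv import load_dotenv  # provided by the python-dotenv package
from langchain_community.document_loaders import BrowserbaseLoader

# Read BROWSERBASE_API_KEY and BROWSERBASE_PROJECT_ID from a local .env file, if present
load_dotenv()

BROWSERBASE_API_KEY = os.getenv("BROWSERBASE_API_KEY")
BROWSERBASE_PROJECT_ID = os.getenv("BROWSERBASE_PROJECT_ID")

loader = BrowserbaseLoader(
    api_key=BROWSERBASE_API_KEY,
    project_id=BROWSERBASE_PROJECT_ID,
    urls=[
        "https://example.com",
    ],
    # Text mode: set to True to retrieve only the text content
    text_content=False,
)

docs = loader.load()
# Print the first 61 characters of the first loaded page
print(docs[0].page_content[:61])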
Loader Options
- urls: Required. A list of URLs to fetch.
- text_content: Retrieve only text content. Default is False.
- api_key: Browserbase API key. Default is the BROWSERBASE_API_KEY env variable.
- project_id: Browserbase Project ID. Default is the BROWSERBASE_PROJECT_ID env variable.
- session_id: Optional. Provide an existing Session ID.
- proxy: Optional. Enable/Disable Proxies.
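As an illustration of how these options fit together, the sketch below collects them into a keyword-argument dictionary that could be passed to BrowserbaseLoader. The helper build_loader_kwargs is hypothetical (it is not part of LangChain or the Browserbase SDK); only the option names come from the list above. It deliberately omits session_id and proxy unless they are provided, so the loader's own defaults apply.

```python
import os


def build_loader_kwargs(urls, text_content=False, session_id=None, proxy=None, env=None):
    """Assemble keyword arguments for BrowserbaseLoader (hypothetical helper).

    `env` defaults to os.environ; pass a dict to make the function easy to test.
    """
    env = os.environ if env is None else env
    if not urls:
        raise ValueError("urls is required and must not be empty")

    kwargs = {
        "urls": list(urls),
        "text_content": text_content,
        # Fall back to the documented environment variables.
        "api_key": env.get("BROWSERBASE_API_KEY"),
        "project_id": env.get("BROWSERBASE_PROJECT_ID"),
    }
    # Optional settings are only included when explicitly provided.
    if session_id is not None:
        kwargs["session_id"] = session_id
    if proxy is not None:
        kwargs["proxy"] = proxy
    return kwargs


example = build_loader_kwargs(
    ["https://example.com"],
    text_content=True,
    env={"BROWSERBASE_API_KEY": "demo-key", "BROWSERBASE_PROJECT_ID": "demo-project"},
)
print(sorted(example))
```

With real credentials in the environment, the result could be used as BrowserbaseLoader(**build_loader_kwargs(["https://example.com"], text_content=True)).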
Related
- Document loader conceptual guide
- Document loader how-to guides