browser-use WebUI
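- I. What is browser-use
- Technology stack used by browser-use
- II. Main features of the browser-use WebUI
- Use cases
- III. Usage tutorial
- 1. Install Python
- 2. Clone the project
- 3. Install dependencies
- 4. Configure the environment
- 5. Launch
- 6. Configuration
- 1. Configure the Agent
- 2. Configure the LLM
- 3. Browser settings
- IV. Getting a DeepSeek API key
- V. WebUI demo
- VI. Code examples
- 1. Create an Agent
- VII. Worked examples
- I. Scraping fund data
- II. Results
- 1. Navigate to the target URL
- 2. Click Fund Rankings
- 3. Extract the top 10 funds' data
- III. Summary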
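I. What is browser-use
Browser Use is an open-source Python library: a browser automation tool designed for large language models, with the goal of letting AI browse and operate web pages as naturally as a person would. It supports multi-tab management, visual recognition, and content extraction, and can record and replay specific actions. Developers can also define custom actions, such as saving data to a database or a file. It works with mainstream LLMs such as DeepSeek, GPT-4, and Claude, can run multiple tasks concurrently, and can correct its own mistakes, which improves the accuracy and efficiency of task execution.
Official site: https://browser-use.com/
Project repository: https://github.com/browser-use/browser-use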
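Technology stack used by browser-use
- 1. Observation: the page-parsing layer. It combines a non-visual and a visual approach: DOM parsing, assisted by screenshots.
- DOM parsing (HTML + XPath): browser-use uses the underlying framework (such as Playwright) to obtain the full HTML structure of the current page and extracts key information such as text and element attributes.
- Screenshot assistance: in some cases (such as CAPTCHA recognition or dynamic graphical verification), HTML parsing alone cannot retrieve the information. The system then takes a page screenshot, automatically or on demand, and passes it to a vision model as auxiliary input.
- 2. Thought: the core decision layer. It analyzes the page information provided by Observation and generates operation instructions.
- 3. Action: the instruction-execution layer. Playwright, developed by Microsoft, serves as the browser-control framework and interacts with the browser directly to carry out the automation task. As a newer-generation, high-performance UI test automation framework, Playwright provides low-latency, stable browser control with fast page loading and element operations.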
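The three layers above form a loop: observe the page, decide on an instruction, execute it, and repeat until the task is done. The following is a minimal conceptual sketch of that loop, not browser-use's actual implementation. `FakeBrowser`, `Observation`, `Action`, `run_loop`, and `scripted_policy` are all hypothetical names invented for illustration; in the real library the "think" step is an LLM call and the browser is driven by Playwright.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Observation:
    """What the agent sees: simplified DOM text plus an optional screenshot."""
    url: str
    elements: List[str]
    screenshot: Optional[bytes] = None


@dataclass
class Action:
    """A single instruction, e.g. navigate to a URL or finish the task."""
    name: str
    argument: str = ""


@dataclass
class FakeBrowser:
    """Hypothetical stand-in for a Playwright-driven browser."""
    url: str = "about:blank"
    history: List[Action] = field(default_factory=list)

    def observe(self) -> Observation:
        # A real implementation would parse the DOM and maybe take a screenshot.
        return Observation(url=self.url, elements=["search box", "search button"])

    def execute(self, action: Action) -> None:
        self.history.append(action)
        if action.name == "goto":
            self.url = action.argument


def run_loop(browser: FakeBrowser,
             think: Callable[[Observation], Action],
             max_steps: int = 10) -> Optional[str]:
    """Observation -> Thought -> Action, repeated until 'done' or the step limit."""
    for _ in range(max_steps):
        observation = browser.observe()   # Observation layer
        action = think(observation)       # Thought layer (an LLM in practice)
        if action.name == "done":
            return action.argument
        browser.execute(action)           # Action layer
    return None


def scripted_policy(observation: Observation) -> Action:
    """Stands in for the LLM: navigate once, then report where we are."""
    if observation.url == "about:blank":
        return Action("goto", "https://example.com")
    return Action("done", f"visited {observation.url}")


browser = FakeBrowser()
result = run_loop(browser, scripted_policy)
print(result)  # visited https://example.com
```

The `max_steps` limit is a design choice of this sketch: an LLM-driven loop needs some bound so that a confused model cannot run forever.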
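II. Main features of the browser-use WebUI
Provides a new web interface that is simple and convenient to operate.
Supports more LLMs, such as Gemini, OpenAI, and Azure, as well as the recently popular Chinese model DeepSeek, with more to be added.
Supports using your own browser, so you do not have to log in repeatedly, and it can record the screen.
Ships a customized, smarter Agent whose optimized prompts make browser use more efficient.
Use cases
- Automated tasks: suited to repetitive, high-frequency browser operations such as form filling, information retrieval, and file downloads
- Data collection: suited to scraping data from the web, for example with a crawler
- Automated testing: suited to web UI test automation; combined with pytest, web automation is easy to implement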
III. Usage tutorial
1. Install Python
Python official site: https://www.python.org/downloads/
The version must be 3.11 or later.
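Before installing anything else, it can help to confirm that the interpreter on your PATH meets the 3.11 requirement. A minimal sketch follows; `meets_minimum` and `MIN_VERSION` are hypothetical names used only here.

```python
import sys

MIN_VERSION = (3, 11)  # minimum version stated in this tutorial


def meets_minimum(version_info=sys.version_info, minimum=MIN_VERSION) -> bool:
    """Compare only (major, minor) against the required minimum."""
    return tuple(version_info[:2]) >= minimum


if meets_minimum():
    print("Python version OK")
else:
    print(f"Python {sys.version_info.major}.{sys.version_info.minor} is too old; "
          "install 3.11 or later.")
```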
2. Clone the project
git clone https://github.com/browser-use/web-ui.git
cd web-ui
3. Install dependencies
pip install browser-use
playwright install
pip install -r requirements.txt
4. Configure the environment
Copy .env.example to a new file named .env, then edit the following entries in .env:
# Path to the Chrome executable (check your own path), for example:
# macOS   "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome"
# Windows "C:\Program Files\Google\Chrome\Application\chrome.exe"
CHROME_PATH="/Applications/Google Chrome.app/Contents/MacOS/Google Chrome"

# Path to the browser's user data directory, for example:
# macOS   "/Users/<YourUsername>/Library/Application Support/Google/Chrome"
# Windows "C:\Users\<YourUsername>\AppData\Local\Google\Chrome\User Data"
CHROME_USER_DATA="/Users/<YourUsername>/Library/Application Support/Google/Chrome"

# The API keys of the LLMs you plan to use also need to be set
...
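A wrong Chrome path is an easy mistake to make at this step, so a quick sanity check of the .env values can save time. The sketch below is an assumption-laden illustration: `parse_env` is a deliberately simplified parser written for this example (it ignores inline comments and escapes; the project itself loads .env with python-dotenv), and `missing_paths` is likewise a hypothetical helper.

```python
from pathlib import Path
from typing import Dict, List


def parse_env(text: str) -> Dict[str, str]:
    """Simplified KEY=VALUE parser: skips blanks and '#' comments, strips quotes."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_paths(env: Dict[str, str],
                  keys=("CHROME_PATH", "CHROME_USER_DATA")) -> List[str]:
    """Return the keys that are unset or point to a path that does not exist."""
    return [k for k in keys if not env.get(k) or not Path(env[k]).exists()]


sample = '''
# Path to the Chrome executable
CHROME_PATH="/Applications/Google Chrome.app/Contents/MacOS/Google Chrome"
CHROME_USER_DATA=""
'''

env = parse_env(sample)
print(env["CHROME_PATH"])
print("Needs attention:", missing_paths(env))
```

To check a real file, pass `Path(".env").read_text(encoding="utf-8")` to `parse_env` instead of the sample string.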
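5. Launch
Start the WebUI with the following command:
python webui.py --ip 127.0.0.1 --port 7788
Once it has started, open http://127.0.0.1:7788/ in a browser. If the WebUI page loads, the launch succeeded.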
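If the page does not load, a first thing to rule out is whether anything is listening on the port at all. Here is a small sketch using only the standard library; `is_listening` is a hypothetical helper, and the host and port are the ones from the launch command above.

```python
import socket


def is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable
        return False


print("WebUI reachable:", is_listening("127.0.0.1", 7788))
```

This only checks that the TCP port accepts connections; it does not confirm that the process behind it is the WebUI.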
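6. Configuration
1. Configure the Agent
Note that Use Vision is checked by default. If you are using DeepSeek, you must uncheck it, because DeepSeek does not support visual input. Many people trip over this, so double-check it.
2. Configure the LLM
For example, I use DeepSeek here.
3. Browser settings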
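When the agent is created from code rather than the WebUI, the same rule can be encoded once so it is not forgotten. This is only a sketch under a loud assumption: it decides by model-name prefix, and the `TEXT_ONLY_PREFIXES` list is a hypothetical, hand-maintained list containing just the DeepSeek case named in this article. The result would be passed as the `use_vision` argument shown in the code examples later on.

```python
# Assumption: text-only models are recognized by a name prefix.
# Only the DeepSeek case comes from this article; extend the tuple yourself.
TEXT_ONLY_PREFIXES = ("deepseek",)


def should_use_vision(model_name: str) -> bool:
    """Return False for models known (by prefix) not to accept image input."""
    return not model_name.lower().startswith(TEXT_ONLY_PREFIXES)


for model in ("deepseek-chat", "gpt-4o"):
    print(model, "-> use_vision =", should_use_vision(model))
```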
IV. Getting a DeepSeek API key
DeepSeek: https://platform.deepseek.com/api_keys
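V. WebUI demo
Enter the task you want to run, then click Run Agent.
While the agent runs, the project log records each step it executes.
During execution a browser window opens and navigates to the target site, and the page elements are annotated block by block.
The execution process and the returned results are saved under Recordings, where they can be replayed.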
VI. Code examples
1. Create an Agent
from langchain_openai import ChatOpenAI
from browser_use import Agent
import asyncio


async def main():
    # The agent takes a natural-language task and the LLM that drives it
    agent = Agent(
        task="Go to Reddit, search for 'browser-use' in the search bar, click on the first post and return the first comment.",
        llm=ChatOpenAI(model="gpt-4o"),
    )
    result = await agent.run()
    print(result)


asyncio.run(main())
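If you do not have an OpenAI key, you can use another model. The following uses DeepSeek as an example.
The file is at browser-use/examples/deepseek.py
import asyncio
import os

from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from pydantic import SecretStr

from browser_use import Agent

# Load DEEPSEEK_API_KEY from the .env file
load_dotenv()

# Default to an empty string so the check below catches a missing key
api_key = os.getenv('DEEPSEEK_API_KEY', '')
if not api_key:
    raise ValueError('DEEPSEEK_API_KEY is not set')


async def run_search():
    agent = Agent(
        task=(
            '1. Type "抖音" (Douyin) into the search box and search '
            '2. Click the first link in the search results '
            '3. Close the QR-code login dialog '
            '4. Return the content of the first video'
        ),
        llm=ChatOpenAI(
            base_url='https://api.deepseek.com/v1',
            model='deepseek-chat',
            api_key=SecretStr(api_key),
        ),
        # DeepSeek does not accept image input, so vision must be off
        use_vision=False,
    )
    await agent.run()


if __name__ == '__main__':
    asyncio.run(run_search())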
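VII. Worked examples
I. Scraping fund data
import asyncio

from langchain_openai import ChatOpenAI
from browser_use import Agent

llm = ChatOpenAI(
    model='deepseek-chat',
    api_key='*************',
    base_url='https://api.deepseek.com',
    temperature=0,
)


async def main():
    agent = Agent(
        task="""
        1. Navigate to https://fund.eastmoney.com/
        2. Click "基金排行" (Fund Rankings)
        3. Return the data of the top 10 funds in JSON format
        """,
        llm=llm,
        use_vision=False,
    )
    result = await agent.run()
    # final_result() returns the text of the agent's last step
    print(result.final_result())


asyncio.run(main())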
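The task prompt in this example is a numbered list of steps. When the steps vary between runs, it may be tidier to build that string from a list. A minimal sketch; `build_task` is a hypothetical helper and not part of browser-use.

```python
from typing import Iterable


def build_task(steps: Iterable[str]) -> str:
    """Join steps into a numbered, one-step-per-line task prompt."""
    return "\n".join(f"{i}. {step}" for i, step in enumerate(steps, start=1))


task = build_task([
    "Navigate to https://fund.eastmoney.com/",
    'Click "基金排行" (Fund Rankings)',
    "Return the data of the top 10 funds in JSON format",
])
print(task)
```

The resulting string can be passed as the `task` argument of `Agent`. Putting each step on its own line also avoids the steps running together.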
II. Results
1. Navigate to the target URL
2. Click Fund Rankings
3. Extract the top 10 funds' data
DEBUG [browser_use] --act Execution time: 0.00 seconds
INFO [controller] 📄 Extracted from page
: ```json
{
"top_10_funds": [
{"rank": 1, "fund_code": "018124", "fund_name": "永贏先進制造智選混合發起A", "date": "03-07", "unit_net_value": "2.1654", "cumulative_net_value": "2.1654", "daily_growth_rate": "2.21%", "1_week": "9.15%", "1_month": "17.58%", "3_months": "64.06%", "6_months": "191.13%", "1_year": "115.94%", "since_inception": "116.54%", "handling_fee": "0.15%"},
{"rank": 2, "fund_code": "018125", "fund_name": "永贏先進制造智選混合發起C", "date": "03-07", "unit_net_value": "2.1501", "cumulative_net_value": "2.1501", "daily_growth_rate": "2.21%", "1_week": "9.15%", "1_month": "17.54%", "3_months": "63.89%", "6_months": "190.55%", "1_year": "115.07%", "since_inception": "115.01%", "handling_fee": "0.00%"},
{"rank": 3, "fund_code": "016530", "fund_name": "鵬華碳中和主題混合A", "date": "03-07", "unit_net_value": "1.7881", "cumulative_net_value": "1.7881", "daily_growth_rate": "3.00%", "1_week": "10.23%", "1_month": "22.92%", "3_months": "68.07%", "6_months": "178.39%", "1_year": "104.00%", "since_inception": "78.81%", "handling_fee": "0.15%"},
{"rank": 4, "fund_code": "016531", "fund_name": "鵬華碳中和主題混合C", "date": "03-07", "unit_net_value": "1.7685", "cumulative_net_value": "1.7685", "daily_growth_rate": "3.00%", "1_week": "10.21%", "1_month": "22.86%", "3_months": "67.81%", "6_months": "177.59%", "1_year": "102.79%", "since_inception": "76.85%", "handling_fee": "0.00%"},
{"rank": 5, "fund_code": "001970", "fund_name": "泰信鑫選靈活配置混合A", "date": "03-07", "unit_net_value": "1.3310", "cumulative_net_value": "1.3310", "daily_growth_rate": "-1.04%", "1_week": "7.25%", "1_month": "4.80%", "3_months": "31.00%", "6_months": "125.59%", "1_year": "95.45%", "since_inception": "33.10%", "handling_fee": "0.15%"},
{"rank": 6, "fund_code": "002580", "fund_name": "泰信鑫選靈活配置混合C", "date": "03-07", "unit_net_value": "1.3220", "cumulative_net_value": "1.3220", "daily_growth_rate": "-0.97%", "1_week": "7.31%", "1_month": "4.84%", "3_months": "31.15%", "6_months": "125.60%", "1_year": "95.27%", "since_inception": "31.67%", "handling_fee": "0.00%"},
{"rank": 7, "fund_code": "016295", "fund_name": "新華利率債債券E", "date": "03-07", "unit_net_value": "1.7977", "cumulative_net_value": "1.9906", "daily_growth_rate": "-0.13%", "1_week": "-0.06%", "1_month": "-0.66%", "3_months": "0.85%", "6_months": "94.13%", "1_year": "92.89%", "since_inception": "99.00%", "handling_fee": "0.00%"},
{"rank": 8, "fund_code": "019457", "fund_name": "平安先進制造主題股票發起A", "date": "03-07", "unit_net_value": "1.7593", "cumulative_net_value": "1.7593", "daily_growth_rate": "1.78%", "1_week": "10.41%", "1_month": "23.92%", "3_months": "57.40%", "6_months": "134.29%", "1_year": "90.71%", "since_inception": "75.93%", "handling_fee": "0.15%"},
{"rank": 9, "fund_code": "019458", "fund_name": "平安先進制造主題股票發起C", "date": "03-07", "unit_net_value": "1.7452", "cumulative_net_value": "1.7452", "daily_growth_rate": "1.78%", "1_week": "10.40%", "1_month": "23.87%", "3_months": "57.17%", "6_months": "133.60%", "1_year": "89.59%", "since_inception": "74.52%", "handling_fee": "0.00%"},
{"rank": 10, "fund_code": "007713", "fund_name": "華富科技動能混合A", "date": "03-07", "unit_net_value": "1.4600", "cumulative_net_value": "1.5100", "daily_growth_rate": "1.47%", "1_week": "8.19%", "1_month": "18.94%", "3_months": "47.07%", "6_months": "135.41%", "1_year": "89.14%", "since_inception": "51.93%", "handling_fee": "0.15%"}
]
}
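The extracted text in this log is JSON wrapped in a Markdown code fence, so it has to be unwrapped before it can be used as data. The sketch below shows one way to do that with the standard library. `extract_json` is a hypothetical helper, the sample string reuses two rows from the log (trimmed to three fields), and whether a given model wraps its answer in a fence is not guaranteed, which is why the function falls back to parsing the raw text.

```python
import json
import re

FENCE = "`" * 3  # a Markdown code fence, built here to keep this listing readable


def extract_json(text: str):
    """Parse JSON from text, unwrapping a surrounding Markdown fence if present."""
    pattern = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)
    match = pattern.search(text)
    payload = match.group(1) if match else text
    return json.loads(payload)


# Two rows taken from the log output, trimmed to three fields each
sample = (
    FENCE + "json\n"
    '{"top_10_funds": ['
    '{"rank": 1, "fund_code": "018124", "1_year": "115.94%"}, '
    '{"rank": 2, "fund_code": "018125", "1_year": "115.07%"}]}\n'
    + FENCE
)

funds = extract_json(sample)["top_10_funds"]

# Percentages arrive as strings, so strip "%" before comparing numerically
best = max(funds, key=lambda f: float(f["1_year"].rstrip("%")))
print(best["fund_code"])  # 018124
```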
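II. Page automation testing with pytest
Running async tests requires the pytest-asyncio plugin:
pip install pytest-asyncio
import pytest
from browser_use import Agent

# `llm` is the ChatOpenAI instance configured in the fund-scraping example


@pytest.mark.asyncio
@pytest.mark.parametrize(
    "username,password,expected",
    [
        ("kevin@xxxx.com", "a123456", "kevin"),
        # The site's error message, meaning "incorrect account or password"
        ("kevin@xxxx.com", "123456", "賬號密碼輸入錯誤"),
    ],
)
async def test_login(username, password, expected):
    agent = Agent(
        task=f"""
        1. Navigate to https://xxxxxxx.com
        2. Enter the username {username} and the password {password}
        3. Click the login button
        4. Verify whether the login succeeded; if it did, return {expected}
        """,
        llm=llm,
        use_vision=False,
    )
    result = await agent.run()
    assert expected in str(result.final_result())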
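III. Summary
Browser Use integrates deeply with large language models (LLMs) and uses semantic understanding plus a vision-assisted decision chain to automate browsers without hard-coded scripts, a sharp break from traditional script development. The AI parses the page structure and generates the operation path dynamically, so nobody has to hand-write XPath/CSS locators; this article puts the gain in development efficiency at more than 5x. It is particularly suited to dynamic verification, anti-scraping measures, and multi-step interactions, which makes it a promising engine for financial data scraping and cross-platform testing.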