[JavaScript] Reading structured data (JSON-LD) from a product page to implement one-to-one redirects without touching the server

Front-end practice: reading the mpn from a product page and implementing a one-to-one redirect

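A requirement that comes up often in real-world development:
after browsing a product page on site A, the user clicks a button and lands directly on the matching product on site B.

On the surface this is just a button that redirects, but mapping each product precisely requires a unique identifier for every product. Many e-commerce pages expose this kind of key information through structured data (JSON-LD), for example mpn (Manufacturer Part Number).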

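1. Background

I needed to read the unique identifier mpn on the product detail page and build the redirect link to the target store from it.
The page itself belongs to a third party, so the data returned by the server cannot be changed.
Goal: hide the original "Add to Cart / Buy Now" buttons and replace them with a custom button that, when clicked, redirects to the matching product on another e-commerce platform (for example DHgate).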

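2. The detours on the way to the data

At first I tried to find the mpn directly in the page DOM:

Looking for a tag with document.querySelector() → failed
Looking in the meta tags → failed
Searching the visible page text → failed

Then I noticed that the page's <head> contained many <script type="application/ld+json"> tags, which carry structured data for SEO.
They typically contain something like: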

{
  "@context": "https://schema.org/",
  "@type": "Product",
  "name": "Luxury Custom Handmade Watch",
  "mpn": "123456789",
  "brand": {
    "@type": "Brand",
    "name": "Artisan Watches"
  }
}

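Note:
JSON-LD (JavaScript Object Notation for Linked Data) is the structured data markup format that Google recommends.
E-commerce sites use it to describe product attributes to search engines, such as name, price, stock, SKU and MPN.

So the most reliable way to get the value here is to parse the JSON-LD rather than scrape it out of the DOM tree.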

3. Solution: parse the JSON-LD and extract the mpn

The implementation:

// Collect every JSON-LD script tag
const ldJsonScripts = document.querySelectorAll('script[type="application/ld+json"]');
let mpnValue = null;

// Parse each block and read the mpn from the entry whose type is Product
ldJsonScripts.forEach(script => {
  try {
    const jsonData = JSON.parse(script.textContent);
    if (jsonData["@type"] === "Product" && jsonData.mpn) {
      mpnValue = jsonData.mpn;
    }
  } catch (e) {
    console.error("Failed to parse JSON-LD:", e);
  }
});

This extracts the mpn reliably as long as the Product is the top-level object of one of the JSON-LD blocks.
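
JSON-LD in the wild does not always put the Product at the top level: a block may be an array, may wrap its entries in "@graph", and "@type" may itself be an array. The following is a minimal sketch of a more tolerant lookup under those assumptions. The helper names collectNodes, isProduct and findMpn are hypothetical, and the function takes the raw script texts as strings so that it can be tried outside a browser.

```javascript
// Hypothetical helper: collect every node in a parsed JSON-LD document,
// descending into top-level arrays and "@graph" containers.
function collectNodes(data) {
  if (Array.isArray(data)) return data.flatMap(collectNodes);
  if (data === null || typeof data !== 'object') return [];
  const nested = Array.isArray(data['@graph'])
    ? data['@graph'].flatMap(collectNodes)
    : [];
  return [data, ...nested];
}

// "@type" may be a single string or an array of strings.
function isProduct(node) {
  const types = [].concat(node['@type'] ?? []);
  return types.includes('Product');
}

// Return the mpn of the first Product found in a list of raw JSON-LD strings,
// or null when none of the blocks contains one.
function findMpn(rawBlocks) {
  for (const raw of rawBlocks) {
    let parsed;
    try {
      parsed = JSON.parse(raw);
    } catch {
      continue; // Skip blocks that are not valid JSON
    }
    const product = collectNodes(parsed).find(node => isProduct(node) && node.mpn);
    if (product) return String(product.mpn);
  }
  return null;
}

// In the browser, the raw strings would come from the script tags.
if (typeof document !== 'undefined') {
  const rawBlocks = [...document.querySelectorAll('script[type="application/ld+json"]')]
    .map(script => script.textContent);
  console.log('mpn:', findMpn(rawBlocks));
}
```

Unlike the forEach version, this returns the first Product it finds and always yields a string, which keeps the later URL construction predictable.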

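4. Replacing the default buttons and adding the redirect logic

Because the goal is a one-to-one redirect, the original purchase entry points have to be blocked and replaced with a redirect to the target link.

4.1 Hiding the original buttons

The original button IDs share common prefixes (such as add-cart-xxx and buynow-xxx), so they can be hidden in bulk by prefix matching: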

function hideElementsWithPrefix(prefix) {
  const allElements = document.getElementsByTagName('*');
  for (const element of allElements) {
    if (element.id && element.id.startsWith(prefix)) {
      element.style.display = 'none';
    }
  }
}

It also runs on a timer so that buttons loaded asynchronously do not slip through:

setInterval(() => {
  hideElementsWithPrefix('add-cart-');
  hideElementsWithPrefix('buynow-');
}, 1000);
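
As an alternative to polling every second, the same prefixes can be expressed as a single CSS rule: a stylesheet rule also applies to elements that are added later, so no timer is needed. This is only a sketch. buildHideRule is a hypothetical helper, and it assumes the prefixes contain nothing but letters, digits, "_" and "-".

```javascript
// Hypothetical helper: build one CSS rule that hides every element whose id
// starts with one of the given prefixes.
function buildHideRule(prefixes) {
  const selectors = prefixes.map(prefix => {
    // Assumption: simple prefixes only, so they can be placed inside a
    // quoted attribute selector without escaping.
    if (!/^[\w-]+$/.test(prefix)) {
      throw new Error(`Unsupported prefix: ${prefix}`);
    }
    return `[id^="${prefix}"]`;
  });
  return `${selectors.join(', ')} { display: none !important; }`;
}

const hideRule = buildHideRule(['add-cart-', 'buynow-']);

// Only touch the DOM when running in a browser.
if (typeof document !== 'undefined') {
  const style = document.createElement('style');
  style.textContent = hideRule;
  document.head.appendChild(style);
}
```

The !important flag is there so that the rule still wins if the page sets display through an inline style.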

4.2 Adding the custom button

const customButton = document.createElement('button');
customButton.textContent = 'Buy Now on DHgate';
customButton.style.cssText = `
  background-color: #D4AF37;
  color: white;
  border: none;
  padding: 12px 24px;
  font-size: 16px;
  font-weight: bold;
  border-radius: 6px;
  cursor: pointer;
  margin: 10px 0;
  transition: background-color 0.3s ease;
`;

// Hover effect
customButton.addEventListener('mouseover', () => { customButton.style.backgroundColor = '#C89A2E'; });
customButton.addEventListener('mouseout', () => { customButton.style.backgroundColor = '#D4AF37'; });

// Redirect on click
customButton.addEventListener('click', () => {
  // Without an mpn the link would point at ".../null.html", so do nothing
  if (!mpnValue) {
    console.warn('No mpn found on this page; skipping redirect.');
    return;
  }
  const targetUrl = `https://www.dhgate.com/product/french-artisanal-luxury-exquisite-custom/${mpnValue}.html`;
  window.location.assign(targetUrl);

  // Fallback in case assign() did not navigate
  setTimeout(() => {
    if (window.location.href !== targetUrl) {
      window.location.href = targetUrl;
    }
  }, 100);
});

Insert it into the page:

// Optional chaining avoids a TypeError when .main_btn does not exist
const insertionPoint =
  document.querySelector('.product-cart-group') ||
  document.querySelector('.main_btn')?.parentNode ||
  document.body;
insertionPoint.appendChild(customButton);

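5. Final result

The page's original "Add to Cart / Buy Now" buttons are hidden.
A new gold button labeled "Buy Now on DHgate" appears.
On click, the redirect link is built from the mpn parsed on the current page and the browser navigates to it.

For example: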

https://www.dhgate.com/product/french-artisanal-luxury-exquisite-custom/123456789.html
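
The URL construction from step 4.2 can be isolated into a small function, which makes the null case and unusual characters in the mpn easier to handle. This is a sketch: buildTargetUrl and PRODUCT_BASE are hypothetical names, the path prefix is the one used in the click handler, and whether the target site accepts an encoded mpn in this position is an assumption.

```javascript
// Path prefix copied from the click handler in step 4.2.
const PRODUCT_BASE = 'https://www.dhgate.com/product/french-artisanal-luxury-exquisite-custom/';

// Hypothetical helper: return the target URL for an mpn,
// or null when the mpn is missing or blank.
function buildTargetUrl(mpn) {
  const value = mpn == null ? '' : String(mpn).trim();
  if (value === '') return null;
  // encodeURIComponent keeps characters such as "/" or "?" from altering the path.
  return `${PRODUCT_BASE}${encodeURIComponent(value)}.html`;
}

console.log(buildTargetUrl('123456789'));
```

Returning null instead of a malformed link lets the click handler decide what to do when no mpn was found.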

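Coming from the back end, it is easy to assume that grabbing a value on the front end "should be simple", but the details are full of pitfalls.
The hunt for the mpn is a typical example of how much the data extraction approach matters on the front end.

Taking "getting key data out of a product page" as the starting point, the rest of this post surveys the common front-end data extraction techniques, with examples and notes.

Front-end data extraction techniques

On the front end, getting data out of a page is not always as straightforward as document.querySelector("#id").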

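1. Direct DOM queries

The most intuitive approach is to look up elements directly with selectors.

// By ID
const price = document.querySelector("#product-price").textContent;

// By class
const title = document.querySelector(".product-title").innerText;

// By tag hierarchy
const stock = document.querySelector("div.product-info span.stock").innerText;

Notes:

If the page is server-side rendered (SSR), the data is in the DOM from the start, and this is the quickest approach.
For a single-page application (SPA), much of the content is rendered later by JavaScript, so the query has to wait, for example for window.onload or a MutationObserver callback; until the element exists, querySelector returns null.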
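
Because querySelector returns null when nothing matches, chaining .textContent directly throws on pages where the element is missing or not yet rendered. A minimal null-safe wrapper might look like the following; readText is a hypothetical helper, and the selector is the illustrative one from the snippet above.

```javascript
// Hypothetical helper: return the trimmed text of the first element matching
// the selector, or null when nothing matches.
function readText(root, selector) {
  const node = root.querySelector(selector);
  return node ? node.textContent.trim() : null;
}

// Browser usage with an illustrative selector.
if (typeof document !== 'undefined') {
  console.log(readText(document, '#product-price'));
}
```

The root parameter accepts document or any element, so the same helper can search inside a single product card.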

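2. Structured data (JSON-LD, Microdata, RDFa)

The mpn from earlier was embedded in a <script type="application/ld+json"> tag.

const scripts = document.querySelectorAll('script[type="application/ld+json"]');
scripts.forEach(s => {
  try {
    const json = JSON.parse(s.textContent);
    if (json["@type"] === "Product") {
      console.log(json.mpn, json.sku, json.brand?.name);
    }
  } catch (e) {
    // Ignore blocks that are not valid JSON
  }
});

Notes:

This markup exists for SEO, and Google, Bing and price-comparison extensions all consume it.
For product comparison, cross-site redirects or crawling, it is usually the most reliable starting point.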
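
Microdata, the second format named in the heading, stores the same properties in itemprop attributes on ordinary elements instead of in a script block. The sketch below reads a single property. readItemprop is a hypothetical helper, and it is deliberately simplified: it looks only at the content attribute (as used on <meta itemprop="..."> tags) and otherwise falls back to the element text, while the full Microdata rules also cover attributes such as href and src.

```javascript
// Hypothetical helper: read one Microdata property from the page.
// Simplification: only the "content" attribute and the element text are considered.
function readItemprop(root, name) {
  const node = root.querySelector(`[itemprop="${name}"]`);
  if (!node) return null;
  const content = node.getAttribute('content'); // null when the attribute is absent
  return (content ?? node.textContent ?? '').trim() || null;
}

// Browser usage: works for markup such as <span itemprop="mpn">XYZ987</span>.
if (typeof document !== 'undefined') {
  console.log(readItemprop(document, 'mpn'));
}
```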

3. Hidden in meta tags / attributes

Some pages put key data in <meta> tags or data-* attributes for search engines or front-end scripts to use.

Example: meta tags

<meta property="product:price:amount" content="99.99">
<meta property="product:sku" content="A12345">

Reading them:

const price = document.querySelector('meta[property="product:price:amount"]').content;
const sku = document.querySelector('meta[property="product:sku"]').content;

Example: data-* attributes

<button id="buyBtn" data-sku="A12345" data-stock="20">Buy Now</button>

Reading them:

const btn = document.getElementById("buyBtn");
console.log(btn.dataset.sku);   // "A12345"
console.log(btn.dataset.stock); // "20"

Notes:

data-* attributes are the HTML5 mechanism for attaching custom, page-private data to elements, and they tend to be more stable than id / class.
They are common on e-commerce sites, news sites and CMS-driven pages.
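
Two details are easy to miss with these attributes: the lookup throws if the tag is absent, and every value arrives as a string ("20", not 20). The following sketch handles both. readMeta and toNumber are hypothetical helpers, and the property name comes from the example above.

```javascript
// Hypothetical helper: read a <meta property="..."> value without throwing
// when the tag is missing.
function readMeta(root, property) {
  const meta = root.querySelector(`meta[property="${property}"]`);
  return meta ? meta.getAttribute('content') : null;
}

// Hypothetical helper: convert an attribute string to a finite number,
// or null when it is missing, blank, or not numeric.
function toNumber(text) {
  if (text == null || text.trim() === '') return null;
  const value = Number(text);
  return Number.isFinite(value) ? value : null;
}

// Browser usage with the property from the example above.
if (typeof document !== 'undefined') {
  console.log(toNumber(readMeta(document, 'product:price:amount')));
}
```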

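4. JSON configuration embedded in the page (window variables / inline scripts)

Some sites attach a global object directly in a <script> tag for the page's other JavaScript to use.

Example:

<script>
  window.__INITIAL_STATE__ = {
    product: {
      id: "12345",
      name: "Luxury Watch",
      price: 299.99,
      mpn: "XYZ987"
    }
  };
</script>

Reading it:

console.log(window.__INITIAL_STATE__.product.mpn);

Notes:

Front-end frameworks that render on the server (React, Vue, Next.js) often do this to inject back-end data into the page.
Press F12, type window in the Console and browse through its properties; the complete data structure can often be found there.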
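
The name and shape of such a global differ from site to site, so reading it defensively avoids a TypeError when a level is missing. The sketch below uses two hypothetical helpers: getPath walks a list of keys, and parseJsonScript covers the variant where the state is shipped as JSON text inside a script tag rather than as a window variable.

```javascript
// Hypothetical helper: follow a list of keys, returning undefined as soon as
// any step is missing.
function getPath(obj, keys) {
  return keys.reduce(
    (current, key) => (current == null ? undefined : current[key]),
    obj
  );
}

// Hypothetical helper: parse the text of a JSON data script,
// returning null when the text is not valid JSON.
function parseJsonScript(text) {
  try {
    return JSON.parse(text);
  } catch {
    return null;
  }
}

// Browser usage with the global from the example above.
if (typeof window !== 'undefined') {
  console.log(getPath(window.__INITIAL_STATE__, ['product', 'mpn']));
}
```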

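5. Intercepting Ajax / Fetch requests

Many pages do not render the data directly; instead the front end requests it from an API through Ajax / fetch while loading.
Such data can be taken straight from the API instead of being scraped from the DOM.

Example:

When the product page opens, the browser may request:

GET https://api.shop.com/product/12345

{
  "id": "12345",
  "title": "Luxury Watch",
  "price": "299.99",
  "mpn": "XYZ987"
}

On the front end it can be intercepted like this:

// Override fetch
const originalFetch = window.fetch;
window.fetch = async (...args) => {
  const response = await originalFetch(...args);
  // The first argument may be a string, a URL or a Request
  const url = typeof args[0] === "string" ? args[0] : (args[0]?.url ?? String(args[0]));
  if (url.includes("/product/")) {
    response.clone().json()
      .then(data => console.log("Captured product data:", data))
      .catch(() => {}); // Ignore responses that are not JSON
  }
  return response;
};

Notes:

This is usually the most stable approach, because it yields the original JSON data directly.
Back-end developers are more familiar with APIs, so once the API request has been found it is more dependable than scraping the DOM.
In DevTools → Network, filter by XHR / Fetch; the target endpoint can often be found there.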
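
When wrapping fetch, the first argument is not always a string: it can also be a URL object or a Request. The matching logic is easier to get right as a pair of small pure functions. This is a sketch: requestUrl and productIdFromUrl are hypothetical helpers, and the /product/<id> path shape is taken from the example request above.

```javascript
// Hypothetical helper: extract the URL string from fetch's first argument,
// which may be a string, a URL object, or a Request object.
function requestUrl(input) {
  if (typeof input === 'string') return input;
  if (input && typeof input.url === 'string') return input.url; // Request
  return String(input); // URL objects stringify to their href
}

// Hypothetical matcher for the example endpoint shape /product/<id>.
// Returns the id segment, or null when the URL does not match.
function productIdFromUrl(url) {
  const match = /\/product\/([^/?#]+)/.exec(url);
  return match ? match[1] : null;
}

console.log(productIdFromUrl(requestUrl('https://api.shop.com/product/12345')));
```

Inside a fetch wrapper, productIdFromUrl(requestUrl(args[0])) then decides whether the cloned response is worth parsing.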

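6. Watching for DOM changes (MutationObserver)

If the page renders asynchronously (for example lazy loading in React/Vue), the data may not be in the DOM at first.
In that case a MutationObserver can watch for node changes.

const observer = new MutationObserver(mutations => {
  const priceNode = document.querySelector(".product-price");
  if (priceNode) {
    console.log("Price:", priceNode.textContent);
    observer.disconnect(); // Stop observing once the value is found
  }
});

observer.observe(document.body, { childList: true, subtree: true });

Notes:

Suitable for SPAs that load content asynchronously.
More elegant than setInterval, though the logic is somewhat more complex.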
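
The observer above only fires on later changes, so it misses an element that is already present when the script runs. A reusable wrapper can check first and observe second. The following is a sketch: onElement is a hypothetical helper, and the observer class is passed in as a parameter (defaulting to the browser's MutationObserver) purely so the function can be exercised outside a browser.

```javascript
// Hypothetical helper: call `callback` with the first element matching
// `selector`, either immediately or as soon as it appears under `root`.
// Returns a function that cancels the wait.
function onElement(root, selector, callback, ObserverClass = globalThis.MutationObserver) {
  const existing = root.querySelector(selector);
  if (existing) {
    callback(existing);
    return () => {};
  }
  const observer = new ObserverClass(() => {
    const node = root.querySelector(selector);
    if (node) {
      observer.disconnect(); // Stop observing once the element is found
      callback(node);
    }
  });
  observer.observe(root, { childList: true, subtree: true });
  return () => observer.disconnect();
}

// Browser usage with the selector from the example above.
if (typeof document !== 'undefined') {
  onElement(document.body, '.product-price', node => {
    console.log('Price:', node.textContent);
  });
}
```

The returned cancel function is useful for giving up after a timeout, for instance with setTimeout(cancel, 10000).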

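7. Canvas / Shadow DOM / encrypted data (special cases)

Some sites use anti-scraping measures and put the data in:

A Canvas rendering (the only option is to take a screenshot and recognize the image)
Shadow DOM (requires going through .shadowRoot)
Base64 / encrypted strings (need decoding / decryption)

These cases fall under anti-scraping; they are uncommon day to day but do show up on some e-commerce and financial sites.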
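
Two of these cases can be handled with a few lines each. The sketch below assumes the simplest variants: a Base64 string that wraps UTF-8 encoded JSON with no further encryption, and shadow roots attached in open mode (closed roots expose shadowRoot as null and cannot be reached this way). decodeBase64Json and deepQuery are hypothetical helpers.

```javascript
// Hypothetical helper: decode a Base64 string that holds UTF-8 encoded JSON.
function decodeBase64Json(encoded) {
  const binary = atob(encoded); // One character per byte
  const bytes = Uint8Array.from(binary, ch => ch.charCodeAt(0));
  return JSON.parse(new TextDecoder().decode(bytes));
}

// Hypothetical helper: querySelector that also searches inside open shadow roots.
function deepQuery(root, selector) {
  const direct = root.querySelector(selector);
  if (direct) return direct;
  for (const host of root.querySelectorAll('*')) {
    if (host.shadowRoot) {
      const found = deepQuery(host.shadowRoot, selector);
      if (found) return found;
    }
  }
  return null;
}

// Browser usage with an illustrative selector.
if (typeof document !== 'undefined') {
  console.log(deepQuery(document, '.product-price'));
}
```

deepQuery visits every element under the root, so on large pages it is better to start from the smallest container known to hold the component.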

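Summary

Method	Characteristics	Typical use	Difficulty
DOM selectors	Intuitive, easy to write	Static pages, SSR	★
JSON-LD / Microdata	Structured, standardized	Product detail pages, SEO-focused sites	★★
meta / data-*	Hidden but easy to read	E-commerce, CMS	★★
window globals	Complete, already structured data	Hydrated React/Vue pages	★★
Ajax / Fetch	Original JSON, usually the most stable	SPAs, e-commerce APIs	★★★
MutationObserver	Handles asynchronous rendering	Lazy-loaded Vue/React content	★★★
Canvas / Shadow DOM	Used for anti-scraping	Financial and e-commerce protection	★★★★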