Google Search API to support and provide `cached_link` HTML | SerpApi, LLC - Old roadmap; we've just migrated our public roadmap to our GitHub `serpapi`

under review

Justin O'Hara

The Google Search API supports a `cached_page_link` field. We are looking to create a new API that scrapes the HTML page from the `cached_page_link`.

February 7, 2022
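
For context, here is a minimal sketch of how the `cached_page_link` field could be collected from a search response today. It assumes the response is already parsed JSON and that results sit under an `organic_results` list with a `link` key per result; those names and the sample data are assumptions for illustration, not something stated in this thread.

```python
from typing import Any


def cached_page_links(response: dict[str, Any]) -> list[tuple[str, str]]:
    """Return (result link, cached_page_link) pairs from a parsed search response."""
    pairs = []
    # "organic_results" and "link" are assumed key names for this sketch.
    for result in response.get("organic_results", []):
        cached = result.get("cached_page_link")
        if cached:  # skip results that carry no cached copy
            pairs.append((result.get("link", ""), cached))
    return pairs


# Hypothetical response fragment, for illustration only.
sample_response = {
    "organic_results": [
        {
            "link": "https://example.com/a",
            "cached_page_link": "https://webcache.googleusercontent.com/search?q=cache:example.com/a",
        },
        {"link": "https://example.com/b"},  # no cached copy available
    ]
}

print(cached_page_links(sample_response))
```

The requested feature would take each of these cached links and return the HTML behind it.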

Anand Chhatpar

To add more context here:

The problem is that fetching the Google cache link for more than a few dozen pages gets our IP blocked by Google. Since you already have the infrastructure to avoid being blocked by Google, I would love the ability to use it to fetch that cached page from Google. This could even be billed as a separate API call.

Illia Zub

Anand Chhatpar, do you want to get only the HTML (via `search_metadata.raw_html_file`)? We won't extract data from the HTML because we can't parse any website at the moment.
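
As a rough illustration of the "HTML only" option, the sketch below reads `search_metadata.raw_html_file` from a parsed response and downloads that file without parsing it. It assumes the field holds a URL to the stored HTML; the helper names `raw_html_url` and `fetch_raw_html` are hypothetical.

```python
from typing import Any, Optional
from urllib.request import urlopen


def raw_html_url(response: dict[str, Any]) -> Optional[str]:
    """Return the value of search_metadata.raw_html_file, or None if absent."""
    return response.get("search_metadata", {}).get("raw_html_file")


def fetch_raw_html(response: dict[str, Any], timeout: float = 10.0) -> Optional[str]:
    """Download the raw HTML referenced by the response, without parsing it."""
    url = raw_html_url(response)
    if url is None:
        return None
    # Assumption: the field is a plain URL that can be fetched over HTTP(S).
    with urlopen(url, timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")
```

A caller would pass the parsed JSON response to `fetch_raw_html` and handle a `None` result when the field is missing.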

Anand Chhatpar

Illia Zub: Hi Illia, yes, the HTML of Google's cache will work for us. You don't need to parse it. Google's cache has an additional option for a "Text Only version", as you'll see in the screenshot that Justin posted. That is another thing you could return without parsing, and that version would be even more helpful for us. Thank you!

Anand Chhatpar

It would be great if the API could also support retrieval of the text-only version that Google webcache provides. That's just one additional query parameter.
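
To show what "one additional query parameter" might look like, here is a small sketch that rewrites a cached link to request the text-only view. It assumes the text-only version is selected with `strip=1`, which is how webcache URLs have commonly been observed to work but is not confirmed anywhere in this thread; the function name is hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def text_only_cache_url(cached_link: str) -> str:
    """Return cached_link with the (assumed) text-only parameter strip=1 set."""
    parts = urlsplit(cached_link)
    # Drop any existing "strip" value so the parameter is not duplicated.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "strip"
    ]
    query.append(("strip", "1"))
    # Keep ":" and "/" unescaped so the "cache:<url>" value stays readable.
    return urlunsplit(parts._replace(query=urlencode(query, safe=":/")))


print(text_only_cache_url("https://webcache.googleusercontent.com/search?q=cache:example.com"))
```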

Justin O'Hara marked this post as under review

Keith Schacht

Justin O'Hara: I'd love an update on this. I'm eager for this feature as well. I've tried requesting both Google webcache and Bing webcache and neither is reliable. SerpApi is in a great position to offer this.

Any update?

Illia Zub

Keith Schacht: Hi Keith, we don't currently support this feature, but we are discussing it internally.