Any URL linked to from another page may be indexed!
SEO usually means improving things like keyword rankings, organic traffic & conversions.
However, in rare but “all-hands-on-deck” cases, knowing how search engines work helps you remove sensitive information from Google’s index quickly. Think of client information that is accessible online but was never meant to be seen by the general public.
Here’s a step-by-step process to find & remove this content:
- Use the search operator “site:domain.com” on Google to find most indexed pages.
- Pull backlinks from SEMrush or another tool, filtered by the patterns you found. Any of these URLs could potentially enter the index, so apply the following steps to them as well. Crawl your own site too, again with the patterns found in mind.
- Submit the pattern (ex: same subfolder), or every URL you found individually, in the Removals tool within Google Search Console. (Do this in Bing’s Block URL tool as well.) For a limited set of URLs, using the URL inspection tool in conjunction may speed up the removal.
- Remove content from the index permanently, since the Removals tool only hides URLs temporarily:
  - Removing the page so it returns a 404/410 error is best, if you don’t need that content anymore.
  - If the content should otherwise remain available to users, use a noindex meta robots tag <meta name="robots" content="noindex" /> in the head section. For PDFs and other non-HTML-based content, send X-Robots-Tag: noindex in the HTTP header. Don’t use robots.txt: it blocks crawling, so Google never sees the noindex, and it only works if there’s no problem in the first place!
  - Consider gating (password-protecting) the content.
- If a legal issue exists, file a request through Google’s “remove content from Google” legal process or use the Personal Information Removal Request Form.
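To make the "patterns" idea in the steps above more concrete, here is a minimal Python sketch that groups a list of URLs (say, from a backlink export or a site crawl) by their leading subfolder. The function names, the `depth` parameter, and the example.com URLs are all hypothetical; the goal is only to show how a shared subfolder might surface as a candidate pattern for a removal request.

```python
from collections import Counter
from urllib.parse import urlsplit


def subfolder_prefix(url: str, depth: int = 1) -> str:
    """Return the scheme, host, and first `depth` folders of a URL."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    # If the path does not end in "/", the last segment is the page itself.
    folders = segments if parts.path.endswith("/") else segments[:-1]
    folders = folders[:depth]
    prefix = f"{parts.scheme}://{parts.netloc}/"
    if folders:
        prefix += "/".join(folders) + "/"
    return prefix


def count_prefixes(urls: list[str], depth: int = 1) -> Counter:
    """Count how many URLs fall under each subfolder prefix."""
    return Counter(subfolder_prefix(u, depth) for u in urls)


# Hypothetical URLs, standing in for a backlink or crawl export.
urls = [
    "https://example.com/clients/report-a.pdf",
    "https://example.com/clients/report-b.pdf",
    "https://example.com/blog/some-post",
]
for prefix, total in count_prefixes(urls).most_common():
    print(total, prefix)
```

A prefix that collects most of the sensitive URLs is a reasonable candidate to submit as a single pattern instead of listing every URL one by one.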
Removal ideally takes 2-3 days, but unfortunately it can take longer depending on the circumstances, perhaps 3-4 weeks.
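While waiting, it can help to confirm that each URL really sends one of the removal signals described above. The following Python sketch classifies a response by its status code, its X-Robots-Tag header, and any meta robots tag in the HTML. The function names and labels such as "gone" or "noindex-header" are made up for illustration, and the regular expression is a rough check, not a full HTML parser.

```python
import re
import urllib.error
import urllib.request
from typing import Mapping

# Rough match for a <meta name="robots" ...> tag; not a full HTML parser.
META_ROBOTS = re.compile(r'<meta[^>]+name=["\']robots["\'][^>]*>', re.IGNORECASE)


def removal_signal(status: int, headers: Mapping[str, str], body: str = "") -> str:
    """Label the removal signal a response carries (labels are hypothetical)."""
    if status in (404, 410):
        return "gone"
    lowered = {key.lower(): value.lower() for key, value in headers.items()}
    if "noindex" in lowered.get("x-robots-tag", ""):
        return "noindex-header"
    for tag in META_ROBOTS.findall(body):
        if "noindex" in tag.lower():
            return "noindex-meta"
    if status in (401, 403):
        return "gated"
    return "none"


def check_url(url: str) -> str:
    """Fetch a URL and classify it; assumes the page is reachable from here."""
    request = urllib.request.Request(url, headers={"User-Agent": "removal-check-sketch"})
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            body = response.read(200_000).decode("utf-8", errors="replace")
            return removal_signal(response.status, dict(response.headers.items()), body)
    except urllib.error.HTTPError as error:
        # urlopen raises for 4xx/5xx responses, so classify those here.
        return removal_signal(error.code, dict(error.headers.items()), "")
```

Calling `check_url` on each affected URL and looking for any that come back as "none" is one way to spot pages that would stay indexable once a temporary removal expires.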