To compete with OpenAI's ChatGPT, Google launched its own LLM-based chatbot, Bard, as a beta on 6th February last year, and over time expanded it to different countries. Bard used LaMDA (Language Model for Dialogue Applications), a machine learning model that Google first showcased back in 2021. Google had not released any product built directly on this model until Bard, which was the first to use it. Despite running on one of the most advanced and technically complex models, Bard didn't strike users as more useful than ChatGPT. That's why, behind the scenes, Google has been working on training new language models that are more robust and can compete with the models developed by OpenAI.
On 6th Dec 2023, Google's CEO Sundar Pichai and DeepMind CEO Demis Hassabis announced Google's new set of language models: Gemini Ultra, Gemini Pro and Gemini Nano. In the announcement blog post, they also mentioned that Gemini Ultra outperforms the competing GPT-4 model on different industry benchmarks. On the same day, Google switched Bard's underlying model from LaMDA to Gemini Pro, and just 2 months later Google rebranded Bard as Gemini.
That's why bard.google.com was moved to gemini.google.com. Earlier, the bard subdomain was used for accessing the chat interface, but on 8th Feb Google changed this, and since then the chat interface has to be accessed on the gemini.google.com subdomain. Even this redirect was not set up in an ideal way. It should have been a single redirect from bard.google.com to gemini.google.com, but interestingly, just 3 days after the subdomain was redirected, I noticed that there were 4 redirects in between. I'm not sure what purpose Google's engineers and SEOs had for 4 redirects when ideally there should be just one.
Another technical error their team made during this migration involves robots.txt. Google engineers migrated the subdomain from Bard to Gemini but missed following the technical SEO best practices that are necessary to ensure accurate visibility of a site's pages in Google's index. Let me walk you through what exactly the issue is and how Google engineers (or maybe SEOs) can solve it.
As of 12 Feb, just 4 days after the rebranding and domain redirect, I noticed that private chat URLs had started to show up in Google Search, which means Google was discovering and indexing these URLs. The affected Gemini URLs have the format gemini.google.com/share/(numbers & alphabets string). None of these private chats should be indexed or show up in Google Search.
Alright, these chat URLs were showing up, but the next question is WHY? Let me explain this further by showing an actual example of a Gemini conversation and creating a shareable link.
Now this URL, of the format https://g.co/gemini/share/404e23b88462, redirects when accessed to https://gemini.google.com/share/404e23b88462. So Googlebot takes this URL, https://gemini.google.com/share/404e23b88462, and checks whether the robots file for this domain allows it to crawl the content. But in the first 2 weeks of February its robots file contained a Disallow: /share/ rule. It's clear that /share/ was disallowed, which means Google can't fetch the content of Gemini URLs having /share/ in them.
But because Gemini is on a google.com subdomain, Google's systems may have considered these URLs important enough. Yes, without looking at the content, Google's systems made the decision to index them, and therefore the Search Console indexing report would have shown them as "Indexed, though blocked by robots.txt". Digging further, it can clearly be seen that these /share/ URLs have a noindex tag implemented. But Googlebot can't even see that tag, because the robots file blocks it from crawling the page.
As Googlebot can't crawl the content of the page, it's unable to see the noindex tag, and it just goes ahead with indexing those /share/ URLs.
Alright, I first noticed this indexing of URLs on 12th Feb, and since then I have highlighted it multiple times on Twitter and even tagged multiple Googlers so that they could pass it on to their engineers/SEOs and get the issue fixed. Eventually, I think, after I mentioned the same thing again and again, it got noticed by Googlers on Twitter, and they ended up making some changes on 27 Feb which ultimately led to Gemini chats dropping out of the Google index. They changed the robots.txt file on 27 Feb, and that's how the issue got resolved. The robots.txt for gemini.google.com as it was on 26 Feb and on 27 Feb, the difference between the two, and what was happening before are covered further down.
What's happening right now
Now that /share/ is no longer disallowed in the robots file, Google is able to fetch these pages, process them and see the noindex tag implemented there. That's why most of these /share/ pages have dropped out of the index. On 5th March I could see only 10, and my best guess was that those would soon be gone too. As of 10th March I can see just 3 of them remaining in the index, which I think will drop off in the coming weeks.
Update on 15 March
Just to check again and see if the URLs are still showing up in search results, today I ran the site:gemini.google.com/share/ query again. Google Search showed me just 2 chat URLs. I think Google has not fetched these URLs again since the change in the robots file was implemented after the 26th. I will keep checking this over the upcoming weeks and will update the story here.
Update on 23 March
I can still see the same 2 URLs showing up in search results for the query site:gemini.google.com/share/. I think that perhaps Google has not yet fetched these 2 URLs again, which is why it has not yet realised that their indexing status has changed and that there is now a noindex tag it can see on these pages.
Update on 25 April
It's been many weeks since I last updated this article on 15th March, because I've been super busy doing SEO for our clients and helping them achieve top rankings in search results. On top of that, I've been doing a lot of analysis of changes in ranking and traffic due to the March Core Update, which started rolling out on 5th March. Coming back to Gemini and the indexing of private chats in search results: as of 25 April these are gone, and I'm not seeing any chat URL ranking in search results now.
Google engineers missed implementing the same robots.txt rules that they had ended up implementing after I highlighted the issue of private Bard conversations being indexed and ranking on Google.
Issue - Gemini Chats Showing Up in Google Search
Technical Deep Dive - Why Gemini Chats Got Indexed?
Any Gemini (previously Bard) chat can be shared with others by creating a publicly shareable link (the same can be done in ChatGPT as well). But when those links started to show up in blog posts, tweets or LinkedIn posts, Googlebot picked them up and started to index them, even though they shouldn't be indexed.
Here I'm asking Google Bard "Why Xugar is best SEO Agency in Melbourne", and after following a couple of steps I got a shareable link.
https://gemini.google.com/share/404e23b88462
The same happens for all shareable URLs created by Gemini. I think the URL initially shows up as g.co because that's one of the most common ways to share URLs on the web nowadays (just like bit.ly).
OK, now when this Gemini URL shows up somewhere on the Internet and Googlebot finds it, it will of course try to fetch it and see what's on the page (that's how Googlebot works).
Let me explain how Google fixed this technical issue
robots.txt for gemini.google.com on 26 Feb:
User-agent: *
Allow: /app/download
Disallow: /app/
Disallow: /chat/
Disallow: /share/
Sitemap: https://gemini.google.com/sitemap.xml
robots.txt for gemini.google.com on 27 Feb:
User-agent: *
Allow: /app/download
Disallow: /app/
Disallow: /chat/
Sitemap: https://gemini.google.com/sitemap.xml
Google removed
Disallow: /share/
which means Googlebot is now able to crawl the private chat URLs, which have the format
https://gemini.google.com/share/(some numbers and alphabets)
Once it crawls them, it sees the noindex tag implemented on all these pages (as I explained and showed above).
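The effect of removing that one line can be checked locally. Here is a minimal sketch using Python's standard `urllib.robotparser`, fed with the two rule sets quoted above (the Sitemap line is left out because it does not affect crawl permissions). The helper name `can_crawl` is my own, and Python's parser applies the first matching rule rather than Google's longest-match logic, so treat it as an approximation that happens to agree with Google for these particular rules.

```python
from urllib.robotparser import RobotFileParser

# Rules as quoted in this post for 26 Feb (before the fix).
ROBOTS_BEFORE = """\
User-agent: *
Allow: /app/download
Disallow: /app/
Disallow: /chat/
Disallow: /share/
"""

# Rules as quoted in this post for 27 Feb (after the fix).
ROBOTS_AFTER = """\
User-agent: *
Allow: /app/download
Disallow: /app/
Disallow: /chat/
"""

SHARE_URL = "https://gemini.google.com/share/404e23b88462"


def can_crawl(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Hypothetical helper: True if `agent` may fetch `url` under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    print("26 Feb rules, /share/ crawlable:", can_crawl(ROBOTS_BEFORE, SHARE_URL))
    print("27 Feb rules, /share/ crawlable:", can_crawl(ROBOTS_AFTER, SHARE_URL))
```

Under the 26 Feb rules the share URL is blocked, and under the 27 Feb rules it becomes crawlable, which is what lets Googlebot finally read the page.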
What was happening before: with the robots file, Google blocked crawling of /share/ while also implementing a noindex tag on all /share/ chat pages.
Google was discovering the page URLs, but because crawling was blocked by robots.txt, it was not able to see the noindex tag.
That results in the "Indexed, though blocked by robots.txt" status.
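To make the interaction between the two signals concrete, here is a small sketch of the reasoning above. It is a simplified model, not Google's actual pipeline: the function names, the outcome labels and the sample markup are my own assumptions, and I'm assuming the noindex is delivered as a robots meta tag in the HTML (it could equally be sent as an HTTP header).

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Looks for <meta name="robots" content="...noindex..."> in a page."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        content = (attr.get("content") or "").lower()
        if name in ("robots", "googlebot") and "noindex" in content:
            self.noindex = True


def has_noindex(html: str) -> bool:
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.noindex


def index_outcome(crawl_allowed: bool, html: str) -> str:
    """Simplified model: a crawler can only obey noindex if it may fetch the page."""
    if not crawl_allowed:
        # The page is never fetched, so the noindex tag stays invisible.
        return "may be indexed, though blocked by robots.txt"
    if has_noindex(html):
        return "excluded by noindex"
    return "eligible for indexing"


# Hypothetical markup standing in for a shared chat page.
SAMPLE_PAGE = '<html><head><meta name="robots" content="noindex"></head><body>chat</body></html>'

if __name__ == "__main__":
    print("before fix:", index_outcome(False, SAMPLE_PAGE))
    print("after fix: ", index_outcome(True, SAMPLE_PAGE))
```

The point of the model is that noindex and Disallow work against each other: blocking the crawl hides the very tag that would have kept the page out of the index.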
<!--