Google shows the URL in its index of my site, but the cached result is completely different

Question

OK.

Explanation:

I created a random function for my website, docur.co.

The function is triggered by the following request:

http://docur.co/random

Robots are blocked from that URL by the rules at:

http://docur.co/robots.txt

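However, Google followed the URL anyway and produced the following search result:

This is the cache:

My question is: can anyone tell me what is going on here? As I said, I may well have done something wrong...

Update:

Would adding rel="nofollow" directly to the anchor, on top of the robots directive, stop Google from following the URL?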

Chris Rutherfurd · Answer

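There is an error in your robots.txt file.

Line 11 contains Allow: /. A robots.txt file does not state which files and directories are allowed, only which ones are disallowed. The only commands the original robots.txt standard supports are "User-agent" and "Disallow".

Because the Disallow: /random command comes after the invalid command, it is possible that the Google search bot hit the invalid command, could not process it, and stopped there, treating the whole robots.txt file as if it did not exist.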

You can validate your robots.txt file with a tool such as the one at http://tool.motoricerca.info/robots-checker.phtml.
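To make the idea of such a check concrete, here is a minimal sketch in Python of a line-by-line scan that reports directives outside a chosen set. It is not how the tool above works internally. The function name `find_unsupported`, the `STRICT_DIRECTIVES` set, and the three-line sample are all hypothetical; the sample is not the real docur.co file. The strict set follows the reading given in this answer, so extensions that some crawlers accept, such as Allow, Crawl-delay or Sitemap, will be reported too.

```python
# Directives accepted by this sketch (an assumption taken from the answer above).
STRICT_DIRECTIVES = {"user-agent", "disallow"}


def find_unsupported(robots_txt, supported=STRICT_DIRECTIVES):
    """Return (line_number, directive) pairs for directives not in `supported`."""
    problems = []
    for number, raw in enumerate(robots_txt.splitlines(), start=1):
        # Drop anything after '#' (a comment) and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        directive, sep, _value = line.partition(":")
        if not sep:
            # The line does not have the "directive: value" shape at all.
            problems.append((number, line))
            continue
        if directive.strip().lower() not in supported:
            problems.append((number, directive.strip()))
    return problems


# Hypothetical sample, only to show the shape of the report.
sample = """\
User-agent: *
Allow: /
Disallow: /random
"""

print(find_unsupported(sample))  # [(2, 'Allow')]
```

Passing a larger `supported` set relaxes the check, for example when the crawler you care about documents support for additional directives.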

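As for why the cached version differs from the live version: the cache shows what Google saw at the moment its spider came through, which for this cache link was 6 April 2016, 16:05:27 GMT. Since /random is a random function, the page served on a later visit need not match that snapshot.

A new version of the robots.txt file that you could use is...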

#The date is August 29th, 1997.
#Robots have taken over the world and documentaries cease to be created by humans.
#what will happen next?
#Want to join the Docur team?
#E-mail jonbonsilver\//at\//gmail\//dot\//com

#Full access for the internet archive.
User-agent: ia_archiver
Disallow: /random

#Every robot that honours the robots.txt standard:
User-agent: *
#Request file from Docur once every second:
Crawl-delay: 1
#Disallowed urls:
#Lets not send bots on a random documentary mission:
Disallow: /random
Disallow: /new-documentaries
#Above is a temp line due to indexing problems.
Disallow: /?page
Disallow: /live-search
Disallow: /vote
Disallow: /favourite
Disallow: /watch-later
Disallow: /save-list
Disallow: /comment
Disallow: /commentlike
Disallow: /commentdislike
Disallow: /add-review
Disallow: /submit-review
Disallow: /add-to/*
Disallow: /post-list
Disallow: /edit-list
Disallow: /documentary-search
Disallow: /new-list-item
Disallow: /settings
Disallow: /notificationread
Disallow: /documentary/*/l
Disallow: */newest
Disallow: */oldest
Disallow: */highest
Disallow: */lowest
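Before deploying a file like this, it can be sanity-checked locally. The sketch below uses Python's standard-library `urllib.robotparser` on an abridged copy of the rules above (only a few Disallow lines are kept; `build_parser` and `ROBOTS_TXT` are names made up for this example). It shows how Python's parser reads the file, which is not necessarily how Googlebot reads it: in particular, Python's parser matches plain path prefixes and does not interpret wildcard patterns such as */newest the way Google does.

```python
from urllib.robotparser import RobotFileParser

# Abridged copy of the proposed rules (assumption: a subset is enough for a check).
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /random

User-agent: *
Crawl-delay: 1
Disallow: /random
Disallow: /new-documentaries
Disallow: /live-search
"""


def build_parser(text):
    """Parse robots.txt content from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# "Googlebot" has no group of its own here, so the "User-agent: *" group applies.
random_allowed = parser.can_fetch("Googlebot", "http://docur.co/random")
home_allowed = parser.can_fetch("Googlebot", "http://docur.co/")
delay = parser.crawl_delay("Googlebot")

print(random_allowed, home_allowed, delay)  # False True 1
```

If `random_allowed` came back as True, the Disallow: /random rule would not be taking effect for that user agent in this parser, which is a cue to look at the lines above it.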