Reduce redundant disallow rules in robots.txt #24

Description

@CMSworker

I thought this problem was specific to the Contao CMS, which uses this package to generate the robots.txt, but I was asked to report it here instead so that everyone using the package can benefit from a fix. (Related: Contao issue #7742 and rejected Contao PR #7743.)

I'm not quite sure how to explain it in a general way, but maybe like this:

When a record contains only the directive `Disallow: /` (just this one line) and, for example, another EventListener adds further disallow directives (or vice versa), the rendered robots.txt ends up with redundant disallow rules, for example:

User-agent: AnyFancyBotName
Disallow: /
Disallow: /somefolder/
Disallow: /anotherfolder/

The additional disallow rules are unnecessary, since `Disallow: /` already prohibits access to those paths. Ideally, this package should check whether a record contains a general `Disallow: /` directive and, if so, drop the other disallow rules from that record when the robots.txt is rendered.
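A minimal sketch of the proposed check, in Python for illustration (the function name `deduplicate_disallow` and the plain-string rule representation are my assumptions, not this package's API):

```python
def deduplicate_disallow(rules):
    """Drop Disallow rules made redundant by a blanket 'Disallow: /'.

    `rules` is a list of raw directive lines for one record,
    e.g. ["User-agent: AnyFancyBotName", "Disallow: /", ...].
    """
    def disallow_path(rule):
        # Return the path of a Disallow directive, or None for other lines.
        name, _, value = rule.partition(":")
        if name.strip().lower() == "disallow":
            return value.strip()
        return None

    paths = [p for p in (disallow_path(r) for r in rules) if p is not None]
    if "/" not in paths:
        return rules  # no blanket rule, nothing is redundant

    # Keep non-Disallow lines and the blanket rule itself;
    # drop the narrower, now-redundant Disallow rules.
    return [r for r in rules
            if disallow_path(r) is None or disallow_path(r) == "/"]


record = [
    "User-agent: AnyFancyBotName",
    "Disallow: /",
    "Disallow: /somefolder/",
    "Disallow: /anotherfolder/",
]
print("\n".join(deduplicate_disallow(record)))
```

The same idea would apply per record in the renderer: only rules within the record that contains `Disallow: /` are filtered, so other user-agent records are left untouched.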
