New crawler for TYPO3 CMS

Large TYPO3 sites nowadays mostly use Solr to add search functionality. Setting up Solr and its schema and configuring the crawling is not a simple task. In addition, Solr consumes a large amount of resources, especially memory.

This is why small and mid-size sites still use indexed_search. Unfortunately, until now there has not been a fully compatible crawler extension for TYPO3 CMS 8. The existing crawler extension, which was mostly used in combination with indexed_search, has only been poorly fixed for CMS 8 and did not work as expected, even after two hours of trying and debugging. Furthermore, it seems to me that it is no longer actively developed.

For these reasons I decided to start completely from scratch to approach the topic of crawling in TYPO3 CMS. My focus was on full compatibility with CMS 8. Moreover, the extension should be easy to use and easy to set up, while giving developers enough opportunities to add any missing functionality.

The result is the extension versatile_crawler. The name suggests high flexibility. In the first version this means that it is quite easy to add new crawler types. In the next version it will be possible to extend or replace the current indexer, which is tailored to indexed_search right now, with custom indexers. This would make it possible, for example, to post the data to a Solr index.

The extension can be found in the TER and on Packagist. The source code is hosted on GitHub. There you can also find the wiki containing the documentation, which will be extended in the future with information for developers on how to extend the extension.
I hope you find this extension useful, and I'm looking forward to getting some feedback.