
📦 clean-mark: Convert Web Articles Into Clean Markdown Text – #reactjs #react #javascript…

  • An article is converted into a text file, which is automatically named using the URL path name.
  • The file type can be specified. The available types are: HTML, TEXT and Markdown.
  • The output file and path can also be specified.
  • On some websites, the text or links are cut from the article.
  • In this case, you have to manually edit the resulting text. Raise an issue on A-Extractor with the link that doesn’t work and I’ll add it to the database, so that next time the text will be extracted correctly.

clean-mark – Convert an article into a clean text


For example, this article:

Is converted into this text file:

The file type can be specified:

The available types are: HTML, TEXT and Markdown.

The output file and path can also be specified:

The extension is added automatically.
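The naming rules described above (name taken from the URL path, extension added automatically, optional explicit output path) can be sketched roughly as follows. This is not clean-mark's actual code: the function `output_name`, the `EXTENSIONS` mapping, the specific extensions (in particular `.txt` for TEXT), the `index` fallback and the example URL are all assumptions made for illustration.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Assumed mapping from the three documented output types to extensions.
EXTENSIONS = {"markdown": ".md", "html": ".html", "text": ".txt"}


def output_name(url, file_type="markdown", output=None):
    """Hypothetical helper: choose an output file name for an article URL."""
    ext = EXTENSIONS[file_type.lower()]
    if output is not None:
        # An explicit output path wins; only the extension is appended.
        return output + ext
    # Otherwise use the last non-empty segment of the URL path.
    path = urlparse(url).path.rstrip("/")
    stem = PurePosixPath(path).name or "index"  # assumed fallback for a bare domain
    return stem + ext


if __name__ == "__main__":
    print(output_name("http://some-website.com/fancy-article"))
    print(output_name("http://some-website.com/fancy-article", "html"))
    print(output_name("http://some-website.com/fancy-article", output="/tmp/article"))
```

Keeping the extension out of the user-supplied path means the same output argument works unchanged for any of the three file types.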

Simply install with yarn:

Or with npm:

This project depends on the A-Extractor project, a database of expressions used for extracting content from blogs and articles.

Clean-mark was tested on all major news sites. On some websites, the text or links are cut from the article. In this case, you have to manually edit the resulting text. Raise an issue on A-Extractor with the link that doesn’t work and I’ll add it to the database, so that next time the text will be extracted correctly.

The desired goals are:
