Web Archiving
Rank | App | Description | Tags | Stars |
---|---|---|---|---|
1 | ArchiveBox/ArchiveBox | 🗃 Open source self-hosted web archiving. Takes URLs/browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs, media, and more... | self-hosted archivebox backups bookmark-archiver browser-bookmarks chromium digipres firefox headless-browser internet-archiving pinboard pocket python rss singlefile warc wayback-machine web-archiving wget youtube-dl | 19871 |
2 | Kovah/LinkAce | LinkAce is a self-hosted archive to collect links of your favorite websites. | self-hosted docker selfhosted php archive bookmark-manager bookmarks laravel archiving bookmark-managers bookmarking | 2434 |
3 | lcomplete/huntly | Huntly is an information management tool: an RSS reader that automatically saves browsed content (including tweets) and manages GitHub stars. | react selfhosted pocket rss twitter github rssreader | 1809 |
4 | kanishka-linux/reminiscence | Self-Hosted Bookmark And Archive Manager | self-hosted selfhosted django archive bookmark-manager bookmarks bookmark | 1712 |
Web archiving is the process of capturing and preserving websites, or parts of them, for future reference or research. It is an important tool for historians, researchers, and anyone interested in preserving the history of the internet.
A number of open-source, self-hosted apps can be used for web archiving. They let you capture websites or parts of websites and access the saved copies later, even if the original site is no longer available.
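Beyond the apps listed above, some widely used formats and tools for web archiving include:
- WARC (Web ARChive) is a file format, standardized as ISO 28500, for storing web archives. It is a format rather than an app: a WARC file holds a sequence of records, typically the HTTP requests and responses for a crawl, so it can contain an entire website or only parts of it, such as the HTML, CSS, and images.
- Wget is an open-source command-line tool for downloading files over HTTP, HTTPS, and FTP. It can mirror entire websites recursively or fetch specific files, and recent versions can also write what they download to a WARC file.
- HTTrack is an open-source offline browser utility that downloads websites to a local directory. It is available as a command-line program and with graphical front ends (WinHTTrack and WebHTTrack), and it can copy entire websites or only selected files.
- WebCopy (Cyotek WebCopy) is a desktop application for Windows that copies websites, or selected pages, for offline viewing. It is free to use, but unlike the other tools in this list it is not open source.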
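
To make the WARC format mentioned in the list above more concrete, here is a minimal sketch of how a single WARC/1.0 record is laid out on disk: a version line, a set of named header fields, a blank line, the content block, and two trailing line breaks. The function name `build_warc_record` and the example URL and payload are hypothetical, chosen only for illustration. Real archiving tools normally write `response` records containing the full HTTP response and often compress them; this sketch writes a simpler `resource` record and is not a substitute for a proper WARC library.

```python
import uuid
from datetime import datetime, timezone


def build_warc_record(target_uri: str, payload: bytes, content_type: str) -> bytes:
    """Serialize one WARC/1.0 'resource' record (hypothetical helper, illustration only)."""
    headers = [
        ("WARC-Type", "resource"),
        ("WARC-Record-ID", f"<urn:uuid:{uuid.uuid4()}>"),
        ("WARC-Date", datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")),
        ("WARC-Target-URI", target_uri),
        ("Content-Type", content_type),
        ("Content-Length", str(len(payload))),
    ]
    head = "WARC/1.0\r\n" + "".join(f"{name}: {value}\r\n" for name, value in headers)
    # A blank line separates the header from the content block,
    # and two CRLF pairs terminate the record.
    return head.encode("utf-8") + b"\r\n" + payload + b"\r\n\r\n"


# Example values are assumptions, not taken from any real capture.
payload = b"<html><body>Hello</body></html>"
record = build_warc_record("https://example.com/", payload, "text/html")
print(record.decode("utf-8"))
```

Appending several such records to one file yields a small, uncompressed `.warc` archive; the `Content-Length` field is what lets a reader skip from one record to the next.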
Web archiving is a valuable way to preserve the history of the internet. By running open-source, self-hosted web archiving apps, you can help ensure that future generations retain access to the websites and information that matter to them.