Package: polite 0.1.3
polite: Be Nice on the Web
Be responsible when scraping data from websites by following polite principles: introduce yourself, ask for permission, take slowly and never ask twice.
Authors:
polite_0.1.3.tar.gz
polite_0.1.3.zip (r-4.5), polite_0.1.3.zip (r-4.4), polite_0.1.3.zip (r-4.3)
polite_0.1.3.tgz (r-4.4-any), polite_0.1.3.tgz (r-4.3-any)
polite_0.1.3.tar.gz (r-4.5-noble), polite_0.1.3.tar.gz (r-4.4-noble)
polite_0.1.3.tgz (r-4.4-emscripten), polite_0.1.3.tgz (r-4.3-emscripten)
polite.pdf | polite.html
polite/json (API)
NEWS
# Install 'polite' in R:
install.packages('polite', repos = c('https://dmi3kno.r-universe.dev', 'https://cloud.r-project.org'))
Bug tracker: https://github.com/dmi3kno/polite/issues
crawler, memoise, rate-limiter, robotstxt, rvest, scraper, webscraping
Last updated 1 year ago from: 8cb4893e4d. Checks: OK: 6, ERROR: 1. Indexed: yes.
Target | Result | Date |
---|---|---|
Doc / Vignettes | OK | Oct 29 2024 |
R-4.5-win | ERROR | Oct 29 2024 |
R-4.5-linux | OK | Oct 29 2024 |
R-4.4-win | OK | Oct 29 2024 |
R-4.4-mac | OK | Oct 29 2024 |
R-4.3-win | OK | Oct 29 2024 |
R-4.3-mac | OK | Oct 29 2024 |
Exports: %>%, bow, guess_basename, html_attrs_dfr, is.polite, nod, politely, rip, scrape, set_rip_delay, set_scrape_delay, use_manners
Dependencies: askpass, assertthat, cachem, cli, clipr, codetools, crayon, credentials, curl, desc, digest, fansi, fastmap, fs, future, future.apply, gert, gh, gitcreds, globals, glue, httr, httr2, ini, jsonlite, lifecycle, listenv, magrittr, memoise, mime, openssl, parallelly, pillar, pkgconfig, purrr, R6, rappdirs, ratelimitr, Rcpp, rlang, robotstxt, rprojroot, rstudioapi, rvest, selectr, spiderbar, stringi, stringr, sys, tibble, usethis, utf8, vctrs, whisker, withr, xml2, yaml, zip
Readme and manuals
Help Manual
Help page | Topics |
---|---|
Introduce yourself to the host | bow is.polite |
Guess download file name from the URL | guess_basename |
Convert collection of html nodes into data frame | html_attrs_dfr |
Agree modification of session path with the host | nod |
Give your web-scraping function good manners | politely |
Print host introduction object | print.polite |
Polite file download | rip |
Scrape the content of authorized page/API | scrape |
Reset scraping/ripping rate limit | set_rip_delay set_scrape_delay |
Use manners in your own package or script | use_manners |
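
The help topics above suggest a typical workflow: bow to the host, nod to a path, then scrape. Below is a minimal sketch of that sequence, not the package's official example. The helper name `scrape_headings`, the `h2` selector, and the host, path, and user-agent values in the usage note are hypothetical placeholders. The sketch also assumes that `scrape()` returns `NULL` when the page is disallowed or the request fails.

```r
# Hypothetical helper: bow -> nod -> scrape, then extract <h2> text.
# Calls are namespaced, so defining the function does not attach any package.
scrape_headings <- function(host, path, user_agent, delay = 5) {
  # Introduce yourself: bow() consults the host's robots.txt and records
  # the user agent and the delay (in seconds) between requests.
  session <- polite::bow(host, user_agent = user_agent, delay = delay)

  # Ask for permission to move to another path on the same host.
  session <- polite::nod(session, path = path)

  # Fetch the page using the permissions and rate limit held in the session.
  page <- polite::scrape(session)

  # Assumption: a disallowed or failed request yields NULL rather than an error.
  if (is.null(page)) {
    return(character(0))
  }

  # Extract the text of every <h2> element (selector chosen for illustration).
  rvest::html_text2(rvest::html_elements(page, "h2"))
}
```

A call might look like `scrape_headings("https://www.example.com", "articles", "me@example.com research bot")`, where every argument is a placeholder. Running it needs network access and the polite and rvest packages installed.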