htmltab: Assemble Data Frames from HTML Tables

HTML tables are a valuable data source, but extracting these data and recasting them into a useful format can be tedious. This package collects structured information from HTML tables. It is similar to 'readHTMLTable()' of the XML package but provides three major advantages. First, the function automatically expands row and column spans in the header and body cells. Second, users are given more control over the identification of the header and body rows that end up in the R table, including semantic header information that appears throughout the body. Third, the function preprocesses the table code, corrects common types of malformations, and removes unneeded parts, which reduces the need for tedious post-processing.
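
To make the first advantage concrete, the following is a minimal, language-neutral sketch in Python of what "expanding row and column spans" means: each cell with a rowspan or colspan is copied into every grid slot it covers, so the result is a rectangular table. It is an illustration under stated assumptions, not the package's R implementation. The names `expand_spans`, `_CellCollector`, and `SAMPLE` are hypothetical, and the sketch assumes well-formed markup with explicitly closed cells, numeric span attributes, and no nested tables.

```python
from html.parser import HTMLParser


class _CellCollector(HTMLParser):
    """Collect (text, rowspan, colspan) for every cell, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._cell = None  # [text_parts, rowspan, colspan] while inside a cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th") and self.rows:
            a = dict(attrs)
            # A missing or empty span attribute counts as 1.
            self._cell = [[], int(a.get("rowspan") or 1), int(a.get("colspan") or 1)]

    def handle_data(self, data):
        if self._cell is not None:
            self._cell[0].append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            parts, rowspan, colspan = self._cell
            self.rows[-1].append(("".join(parts).strip(), rowspan, colspan))
            self._cell = None


def expand_spans(html):
    """Return the table as a rectangular list of rows with spans filled in."""
    parser = _CellCollector()
    parser.feed(html)
    parser.close()
    n_rows = len(parser.rows)
    grid = {}  # (row, col) -> cell text
    for r, cells in enumerate(parser.rows):
        c = 0
        for text, rowspan, colspan in cells:
            # Skip slots already occupied by a rowspan from an earlier row.
            while (r, c) in grid:
                c += 1
            # Copy the cell into every slot it spans, clamped to the table height.
            for dr in range(min(rowspan, n_rows - r)):
                for dc in range(colspan):
                    grid[(r + dr, c + dc)] = text
            c += colspan
    n_cols = max((c for _, c in grid), default=-1) + 1
    return [[grid.get((r, c), "") for c in range(n_cols)] for r in range(n_rows)]


SAMPLE = """
<table>
  <tr><th rowspan="2">Name</th><th colspan="2">Score</th></tr>
  <tr><th>Math</th><th>Art</th></tr>
  <tr><td>Ann</td><td>90</td><td>85</td></tr>
</table>
"""

for row in expand_spans(SAMPLE):
    print(row)
```

For the sample table, the two header rows expand to `['Name', 'Score', 'Score']` and `['Name', 'Math', 'Art']`, followed by the body row `['Ann', '90', '85']`. Once the header is rectangular like this, the header rows can be collapsed into one column name per column, which is the kind of post-processing the description says the package aims to make unnecessary.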

Version: 0.8.2
Depends: R (≥ 3.0.0)
Imports: XML (≥ 3.98.1.3), httr (≥ 1.0.0)
Suggests: testthat, knitr, tidyr, rmarkdown, spelling
Published: 2021-09-16
Author: Christian Rubba [aut], Gerhard Burger [ctb, cre], Roman Cheplyaka [ctb]
Maintainer: Gerhard Burger <burger.ga at gmail.com>
BugReports: https://github.com/htmltab/htmltab/issues
License: MIT + file LICENSE
URL: https://github.com/htmltab/htmltab
NeedsCompilation: no
Language: en-US
Materials: README, NEWS
CRAN checks: htmltab results

Documentation:

Reference manual: htmltab.pdf
Vignettes: Hassle-free HTML tables with htmltab

Downloads:

Package source: htmltab_0.8.2.tar.gz
Windows binaries: r-devel: htmltab_0.8.2.zip, r-release: htmltab_0.8.2.zip, r-oldrel: htmltab_0.8.2.zip
macOS binaries: r-release (arm64): htmltab_0.8.2.tgz, r-oldrel (arm64): htmltab_0.8.2.tgz, r-release (x86_64): htmltab_0.8.2.tgz, r-oldrel (x86_64): htmltab_0.8.2.tgz
Old sources: htmltab archive

Linking:

Please use the canonical form https://CRAN.R-project.org/package=htmltab to link to this page.