Provided by: wget2-dev_2.1.0-2.1build2_amd64
NAME
libwget-robots - Robots Exclusion file parser
SYNOPSIS
   Data Structures
       struct wget_robots_st

   Functions
       int wget_robots_parse (wget_robots **_robots, const char *data, const char *client)
       void wget_robots_free (wget_robots **robots)
       int wget_robots_get_path_count (wget_robots *robots)
       wget_string * wget_robots_get_path (wget_robots *robots, int index)
       int wget_robots_get_sitemap_count (wget_robots *robots)
       const char * wget_robots_get_sitemap (wget_robots *robots, int index)
Detailed Description
The purpose of this set of functions is to parse a Robots Exclusion Standard file into a data structure for easy access.
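To illustrate what parsing a Robots Exclusion file into a data structure involves, here is a minimal, self-contained sketch. It is not libwget's implementation: the names robots_sketch, sketch_parse and sketch_free are hypothetical, the fixed-size arrays and the exact (case-insensitive) user-agent match are simplifying assumptions, and only the User-agent, Disallow and Sitemap directives are handled.

```c
#include <stdlib.h>
#include <string.h>
#include <strings.h>   /* strcasecmp, strncasecmp (POSIX) */

#define MAX_ENTRIES 32

/* Hypothetical stand-in for the parsed result; NOT libwget's wget_robots. */
typedef struct {
    char *paths[MAX_ENTRIES];     /* Disallow paths that apply to the client */
    int npaths;
    char *sitemaps[MAX_ENTRIES];  /* Sitemap URLs, independent of the client */
    int nsitemaps;
} robots_sketch;

/* Copy a directive value, trimming blanks and a trailing '\r'. */
static char *dup_value(const char *s, size_t len)
{
    while (len && (*s == ' ' || *s == '\t')) { s++; len--; }
    while (len && (s[len - 1] == ' ' || s[len - 1] == '\t' || s[len - 1] == '\r'))
        len--;
    char *v = malloc(len + 1);
    if (v) { memcpy(v, s, len); v[len] = 0; }
    return v;
}

/* Walk the data line by line and collect the entries relevant to `client`. */
void sketch_parse(robots_sketch *r, const char *data, const char *client)
{
    int applies = 0;  /* 1 while the current group matches the client or "*" */
    int prev_ua = 0;  /* 1 if the previous directive was a User-agent line */

    memset(r, 0, sizeof(*r));
    while (*data) {
        size_t len = strcspn(data, "\n");

        if (len >= 11 && !strncasecmp(data, "User-agent:", 11)) {
            char *ua = dup_value(data + 11, len - 11);
            if (!prev_ua)
                applies = 0;  /* a new group starts here */
            if (ua && (!strcmp(ua, "*") || (client && !strcasecmp(ua, client))))
                applies = 1;
            free(ua);
            prev_ua = 1;
        } else if (len >= 9 && !strncasecmp(data, "Disallow:", 9)) {
            prev_ua = 0;
            if (applies && r->npaths < MAX_ENTRIES) {
                char *p = dup_value(data + 9, len - 9);
                if (p && *p) r->paths[r->npaths++] = p; else free(p);
            }
        } else if (len >= 8 && !strncasecmp(data, "Sitemap:", 8)) {
            if (r->nsitemaps < MAX_ENTRIES) {
                char *s = dup_value(data + 8, len - 8);
                if (s && *s) r->sitemaps[r->nsitemaps++] = s; else free(s);
            }
        }

        data += len;
        if (*data == '\n')
            data++;  /* step over the line terminator */
    }
}

/* Release everything collected by sketch_parse(). */
void sketch_free(robots_sketch *r)
{
    for (int i = 0; i < r->npaths; i++) free(r->paths[i]);
    for (int i = 0; i < r->nsitemaps; i++) free(r->sitemaps[i]);
    memset(r, 0, sizeof(*r));
}
```

The sketch keeps two lists, mirroring the split between disallowed paths and sitemap URLs described here. Disallow entries are collected only inside a group that matches the client, while Sitemap entries are collected regardless of the group.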
Function Documentation
   int wget_robots_parse (wget_robots ** _robots, const char * data, const char * client)
       Parameters
           _robots Address of the pointer that receives the allocated wget_robots structure
           data Memory with robots.txt content (with trailing 0-byte)
           client Name of the client / user-agent

       Returns
           An int status code. On success, the allocated wget_robots structure is stored in *_robots.

       The function parses the robots.txt data and fills a wget_robots structure with a list of the disallowed paths and a list of the sitemap files. The wget_robots structure has to be freed by calling wget_robots_free().

   void wget_robots_free (wget_robots ** robots)
       Parameters
           robots Pointer to pointer to wget_robots structure

       wget_robots_free() frees the previously allocated wget_robots structure.

   int wget_robots_get_path_count (wget_robots * robots)
       Parameters
           robots Pointer to instance of wget_robots

       Returns
           The number of paths listed in robots

   wget_string * wget_robots_get_path (wget_robots * robots, int index)
       Parameters
           robots Pointer to instance of wget_robots
           index Index of the wanted path

       Returns
           The path at index or NULL

   int wget_robots_get_sitemap_count (wget_robots * robots)
       Parameters
           robots Pointer to instance of wget_robots

       Returns
           The number of sitemaps listed in robots

   const char * wget_robots_get_sitemap (wget_robots * robots, int index)
       Parameters
           robots Pointer to instance of wget_robots
           index Index of the wanted sitemap URL

       Returns
           The sitemap URL at index or NULL
Author
Generated automatically by Doxygen for wget2 from the source code.