Problems with data extraction from web pages were analysed, and a solution is proposed in this paper. The analysis showed that data-based algorithms are more popular than path-based data extraction algorithms. We propose a new data retrieval algorithm based on the similarity of web page data to controlled data.
The proposed data retrieval algorithm was applied to the retrieval of currency exchange rate data, and the efficiency of the algorithm prototype was evaluated by comparing it to other products. The research showed that the proposed algorithm, although it is more suitable for the retrieval of constantly changing data and requires controlled data, is more efficient than other similar products.
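
To illustrate the general idea of locating page data by its similarity to controlled data, a minimal Python sketch is given here. It is not the algorithm evaluated in the paper: the function name `find_similar_values`, the relative `tolerance` threshold, and the sample values are all assumptions introduced for illustration only. The sketch collects the text nodes of a page, extracts numeric tokens, and keeps, for each controlled value, the page value closest to it within the tolerance.

```python
import re
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collects the non-empty text nodes of an HTML document."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


# Plain decimal numbers such as "1.0842" or "2024" (an illustrative pattern).
NUMBER = re.compile(r"\d+(?:\.\d+)?")


def find_similar_values(html, controlled, tolerance=0.02):
    """Return, per controlled name, the page value closest to its reference.

    `controlled` maps a name to a positive reference value (hypothetical
    controlled data). A page value matches when it lies within the
    relative `tolerance` of the reference.
    """
    collector = TextCollector()
    collector.feed(html)
    collector.close()

    matches = {}
    for chunk in collector.chunks:
        for token in NUMBER.findall(chunk):
            value = float(token)
            for name, reference in controlled.items():
                distance = abs(value - reference)
                if distance > tolerance * reference:
                    continue  # not similar enough to the controlled value
                best = matches.get(name)
                if best is None or distance < abs(best - reference):
                    matches[name] = value
    return matches


# Made-up sample page and controlled values, for illustration only.
page = (
    "<table>"
    "<tr><td>USD</td><td>1.0842</td></tr>"
    "<tr><td>GBP</td><td>0.8571</td></tr>"
    "<tr><td>Updated</td><td>2024</td></tr>"
    "</table>"
)
controlled = {"USD": 1.08, "GBP": 0.86}
result = find_similar_values(page, controlled)
```

Under these assumptions, the values are found without any knowledge of the page layout, which is why such an approach depends on controlled data being available and suits values that change gradually between retrievals.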