HTML Scraping from Navision

Sometimes you need to show a web page and afterwards use the results or data from that page in Navision.

This can be done by using HTML Scraping. HTML Scraping is a technique which allows you to extract data from a web page.

HTML Scraping consists of 3 elementary steps:

  • First open a browser and find the page from where data should be extracted
  • Second extract the HTML page and find the tag which contains the data – and extract it
  • Third close the browser

Let’s take a closer look at each of the mentioned steps.

To open a browser, use the automation ‘Microsoft Internet Controls’.InternetExplorer:

IF ISCLEAR(iExplorer) THEN
   CREATE(iExplorer);
 
iExplorer.Visible(TRUE); //Show the browser window
 
iExplorer.Navigate('http://www.byllemos.com');

Where iExplorer is defined as ‘Microsoft Internet Controls’.InternetExplorer with the property WithEvents set to Yes.

If you do not want the browser window to be shown to the user, set Visible to FALSE.

Now it’s time for the HTML Scraping :-) To do this, we add code to the iExplorer::DocumentComplete trigger, because we only want to scrape pages that have finished loading.

First you have to load the document:

//Load the document
HTMLdocument := iExplorer.Document();

Where HTMLdocument is ‘Microsoft HTML Object Library’.HTMLDocument

Now you are able to parse the document and get data from it:

//Get a specific tag
HTMLElementCollection := HTMLdocument.getElementsByTagName('Title');
 
//select a single element (index 0 = the first match)
HTMLElement := HTMLElementCollection.item(0);
 
//get the value
returnvalue := HTMLElement.innerText;

Where HTMLElementCollection is ‘Microsoft HTML Object Library’.IHTMLElementCollection and HTMLElement is ‘Microsoft HTML Object Library’.HTMLHtmlElement
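
To read every matching element rather than only the first, the collection can be walked by index. This is only a sketch: the variable i (Integer) is introduced here for illustration, the tag name 'a' is just an example, and HTMLElement is assumed to be declared as ‘Microsoft HTML Object Library’.IHTMLElement so that it can hold any kind of element.

//Sketch: loop through all links on the page (i and the tag 'a' are illustrative assumptions)
HTMLElementCollection := HTMLdocument.getElementsByTagName('a');

FOR i := 0 TO HTMLElementCollection.length - 1 DO BEGIN
   HTMLElement := HTMLElementCollection.item(i);
   MESSAGE('%1', HTMLElement.innerText);
END;

If no element matches, the collection is empty and the loop body is simply skipped.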

Now you have scraped the page and can close the browser:

iExplorer.Quit;
 
CLEAR(iExplorer);

Before quitting the browser, though, you must be sure that the page has finished loading.
One way to do this is to wait until the browser is no longer busy:

WHILE iExplorer.Busy() DO
   SLEEP(100); //pause briefly instead of spinning at full CPU

While a web page is loading, the browser reports that it is busy.
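
A wait like this never ends if the page never finishes loading, so it may be worth giving it an upper bound. The following is a minimal sketch, assuming an extra Integer variable named Attempts (not part of the original code) and an arbitrary limit of about 30 seconds:

//Sketch: wait for the browser, but give up after roughly 30 seconds
//Attempts is a hypothetical Integer variable; 300 x 100 ms is an arbitrary limit
Attempts := 0;
WHILE iExplorer.Busy() AND (Attempts < 300) DO BEGIN
   SLEEP(100);
   Attempts := Attempts + 1;
END;

IF iExplorer.Busy() THEN
   ERROR('The page did not finish loading within 30 seconds.');

The limit and the error text should be adjusted to whatever suits the page being scraped.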

If you are showing the web page to the user, you also have to wait for the user to close the browser; otherwise you risk your codeunit ending too early. This can be done by waiting for the Visible property to change:

WHILE iExplorer.Visible() DO
   SLEEP(100); //pause briefly instead of spinning at full CPU

That’s all – now you are able to perform HTML Scraping :-)
