Reading the content of a webpage

It could be fun to create some kind of visualization of a website and its connections.

Has anyone tried that?

I thought I could load the content using http.request, but I can’t get it to load any content from e.g. but I can load

Is there a way to receive the HTML code of a web page?

Thanks :slight_smile:

Hi @Jorgensen,

You should be able to read the HTML from any *.htm or *.html page, as these are simple text files that the server can transfer as they are. This is not the case with PHP: the files have to be interpreted by the PHP processor and sent to you as an htm/html page.

So choose another page.

Why pick on a php page? If you are looking for php program tips there are plenty on the net.



I know php and use it. I thought I would use my own page for testing.

So it’s not possible to read an index.php page? I thought that a PHP file gets converted to HTML by the server. I’ll try another page.

I just tried using but it does not load anything :-/ If I were to track links on a page, lots of them would probably point to php or asp pages…

Hi @Jorgensen,

There is code in the forum for http capture of web pages. I have used it with my own web page (which is PHP based) and it only passes on the HTML, as expected. The only problems I had were when the page had a PHP error, in which case nothing was downloaded.



Ok. I might have misunderstood it/you.

If I want to visualize the connections between webpages, I need to read a webpage and then read the links from that webpage. And those links can point to index.php, index.html or whatever :slight_smile: I won’t know in advance.
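To make the idea concrete, here is a rough Python sketch (not the http.request call mentioned above) of following links from page to page and recording which page links to which. The names `extract_links`, `build_link_graph` and `FAKE_SITE` are made up for illustration, and the `fetch` argument is assumed to be any function that takes a URL and returns the HTML text (or None if nothing could be loaded). The link extraction does not care whether a link ends in .php, .html or anything else.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkCollector(HTMLParser):
    """Collects the raw href value of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(html, base_url):
    """Return absolute link targets found in html, without #fragments."""
    collector = LinkCollector()
    collector.feed(html)
    return [urldefrag(urljoin(base_url, href)).url for href in collector.hrefs]


def build_link_graph(start_url, fetch, max_pages=50):
    """Breadth-first walk; returns {page_url: [linked_url, ...]}."""
    graph = {}
    queue = deque([start_url])
    while queue and len(graph) < max_pages:
        url = queue.popleft()
        if url in graph:
            continue  # already visited
        html = fetch(url)
        links = extract_links(html, url) if html else []
        graph[url] = links
        queue.extend(link for link in links if link not in graph)
    return graph


# Hypothetical three-page site used instead of real network requests.
FAKE_SITE = {
    "http://example.com/index.php": '<a href="about.html">About</a> <a href="/blog.php">Blog</a>',
    "http://example.com/about.html": '<a href="index.php">Home</a>',
    "http://example.com/blog.php": '<a href="index.php#top">Home</a>',
}

graph = build_link_graph("http://example.com/index.php", FAKE_SITE.get)
for page, links in graph.items():
    print(page, "->", links)
```

The resulting dictionary is the raw material for a visualization: each key is a node and each entry in its list is an edge. The `max_pages` limit is only there so a walk over a real site cannot run forever.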

And I don’t get why it should be an issue at all, as PHP runs on the server side. Isn’t HTML the only thing a normal browser receives?

Ahhhh I see now - if I request index.php, I request the content itself - not the content processed on the server side.

So the real solution would be reading not - but it is not possible to read right?

It could have been fun to view some kind of visualization of an Internet map :slight_smile:

Hi @Jorgensen,

The problem is this: when you request the *.php file over HTTP, the server knows not to send the file directly to you. It first passes the *.php file to the PHP interpreter, which constructs the web page that is returned to you.

This resultant web page only has HTML code in it - but it will have links - some of these may be to other *.php pages.
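A small self-contained Python sketch can show this round trip. It is only an illustration, not the forum's http capture code: `FakePhpHandler` is a made-up stand-in for a server with a PHP interpreter behind it, and `fetch_html` and `demo` are hypothetical helper names. The client asks for /index.php and gets back only generated HTML, including a link to another .php page.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class FakePhpHandler(BaseHTTPRequestHandler):
    """Stand-in for a web server that runs a script and sends its output."""

    def do_GET(self):
        # The "script" runs here, on the server. Only its output is sent.
        body = '<html><body><a href="about.php">About</a></body></html>'.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_html(url):
    """Return the body of the response for url as text."""
    # An empty ProxyHandler skips any system proxy; this matters only
    # because the demo talks to a server on 127.0.0.1.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(url, timeout=10) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def demo():
    """Start the fake server on a free port, request /index.php, stop it."""
    server = HTTPServer(("127.0.0.1", 0), FakePhpHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        return fetch_html(f"http://127.0.0.1:{port}/index.php")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

Against a real site, `fetch_html` would be called with the page's public URL. Whether that works also depends on things this sketch ignores, such as redirects, HTTPS and servers that reject unknown clients.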

If you have your own web page, using index.php or similar, you know what code is in your own page. Use one of the web browser development extensions (most browsers now have them built in or available as add-ons) and look at the code that has been sent to you.

Hope that helps - this system is designed to protect your web pages, so that no one can read the server-side source (and anything private in it) just by requesting the page.