[mdlug] [OT] Yahoo web server detects my java bot HTTP request
R KANNAN
rk111810 at gmail.com
Sun Feb 5 09:17:46 EST 2012
Hello,
I have some Java code that makes HTTP requests to finance.yahoo.com, parses
the returned page for the net asset value, and writes it to a text file in a
format that can be read into Quicken. So basically it is a crude way of
getting stock prices.
It has been working fine since 2004, with a few tweaks to the search strings
whenever Yahoo makes a change to their web pages.
It stopped working on Wed (2/1), and I thought it was just a matter of
changing a search string to fix it. But on debugging the code, I found that
the web server was not returning most of the page when requested from the
Java code. It was returning HTML comments like
<!--> robot0 <-->
in the page segments where the price was supposed to be, whereas the page
source viewed from the browser looks fine.
I switched to money.cnn.com, which doesn't detect my bot, and it seems to
work for now.
I am just curious how the web server detected that the request came from a
bot (perhaps it was looking for browser properties: Firefox/IE/Opera,
version, etc.) and how those properties can be included in my HTTP requests.
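For reference, here is a minimal sketch of what "including browser
properties" could look like, assuming the server keys on the User-Agent
request header. That is only a guess; the server may check other things as
well. By default, Java's HttpURLConnection identifies itself with a
User-Agent of the form "Java/<version>", which is easy to tell apart from a
browser. The class name, the method names, and the User-Agent string below
are made up for illustration and are not taken from my actual code.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URI;

class BrowserLikeRequest {
    // Hypothetical browser identification string (an assumption); any
    // string copied from a real browser could be substituted here.
    static final String USER_AGENT =
        "Mozilla/5.0 (X11; Linux x86_64; rv:10.0) Gecko/20100101 Firefox/10.0";

    // Prepare a connection whose headers resemble a browser's.
    // Nothing is sent over the network until the response is read.
    static HttpURLConnection open(String address) throws IOException {
        HttpURLConnection conn =
            (HttpURLConnection) URI.create(address).toURL().openConnection();
        // Replace the default "Java/<version>" identification.
        conn.setRequestProperty("User-Agent", USER_AGENT);
        // Browsers also send Accept headers; whether the server checks
        // them is unknown, so this is included only as a precaution.
        conn.setRequestProperty("Accept", "text/html");
        return conn;
    }

    // Fetch the page body as a single string.
    static String fetch(String address) throws IOException {
        HttpURLConnection conn = open(address);
        StringBuilder page = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), "UTF-8"))) {
            String line;
            while ((line = in.readLine()) != null) {
                page.append(line).append('\n');
            }
        } finally {
            conn.disconnect();
        }
        return page.toString();
    }

    public static void main(String[] args) throws IOException {
        // Pass the page address on the command line.
        if (args.length > 0) {
            System.out.println(fetch(args[0]));
        }
    }
}
```

If the returned page still contains the robot comments with these headers
set, then the detection is based on something else (cookies, JavaScript,
request rate, etc.) and this sketch will not help.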
Thanks