Quick question and a request for help on "chart-scraping"

D1984
Executive Member
Posts: 730
Joined: Tue Aug 16, 2011 7:23 pm

Quick question and a request for help on "chart-scraping"

Post by D1984 » Tue Jun 02, 2020 5:48 pm

Do we have any computer/software/scripting experts on here? I have a quick question/issue:

I am trying to get some daily data on certain mutual funds (including two that were closed/liquidated a few years ago) in order to run a backtest in Excel and PV. CRSP only has monthly data (not that I have access to CRSP anyway; I've got about 20,000 reasons for not having said access and they all have George Washington's picture on them, LOL), so it seems the best source may be Morningstar. If you go to, for instance, http://quotes.morningstar.com/chart/fund/chart?t=AIVSX you will see you can get daily TR data back to 1971 and monthly TR data before that. I really only need the data from 1-1-1971 to 12-31-1979, because from the start of 1980 onwards (all the way to liquidation/close) I already have the daily data I downloaded from Yahoo Finance.

The problem is how to get it from Morningstar without having to manually read it and input it into Excel one day at a time. For a still-extant fund like the above it is no issue; Morningstar's new site lets you download the data in .CSV or .XLS format all the way back to fund inception. The problem is what to do for the two funds that were closed/merged (one in 2016 and one two years or so later). If you go into the abovementioned chart (I use it as an example since AIVSX goes all the way back to 1934 and thus very few funds are older than it) and type in the ticker of any fund closed after roughly mid-2015, you will get a second line on the chart showing that fund's daily/monthly return from inception all the way to whenever in 2016, '17, '18, or '19 it closed. But again, you would have to view and write down the data for each day manually (and there are over 240 trading days a year).

I am aware that in Firefox or Chrome you can press F12 to open the developer tools and get a link that lets you download the weekly total return data (and monthly before that) in .txt format, but again, this is not the actual daily data shown in the chart itself from 1-1-1971 onwards.

Does anyone know enough about JSON (or whatever kind of scripting is behind these charts) to know of a way to download the above type of data without having to manually record it one day at a time? Thank you.
Xan
Administrator
Posts: 4392
Joined: Tue Mar 13, 2012 1:51 pm

Re: Quick question and a request for help on "chart-scraping"

Post by Xan » Tue Jun 02, 2020 6:05 pm

It looks like if you pull up the Network tab of the Dev Tools while refreshing the page, one of the requests that gets made is to "defaultChart?type=getcc&secids=%24FOCA%24LB%24%24;CA]FO&dataid=117&startdate=2009-06-03&enddate=2020-06-02&currency=USD&format=1&adjusment=-" with a "plain" type.

Pull up the Response to that request and you get a long string.

If you copy that response, starting with the first "{" and ending with the last "}", then you have JSON data. You can view it in http://jsonviewer.stack.hu/, for example. Drill down a bit and you'll find an entry for each date containing the date and the value.

As for how to convert it to txt, that's a job for your favorite scripting language.
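
A minimal Python sketch of the trimming step described above, assuming the response has been saved as text and that everything between the first "{" and the last "}" is valid JSON. The sample string, its callback name, and its keys are made up for illustration; the real response will look different.

```python
import json


def extract_json(raw: str):
    """Parse the JSON object embedded in a wrapped response string."""
    start = raw.find("{")   # position of the first opening brace
    end = raw.rfind("}")    # position of the last closing brace
    if start == -1 or end < start:
        raise ValueError("no JSON object found in the response text")
    return json.loads(raw[start:end + 1])


# Made-up stand-in for the real response (hypothetical wrapper and keys).
sample = 'callback({"status": {"code": 200}, "data": [1, 2, 3]});'
parsed = extract_json(sample)
print(parsed["data"])
```

Searching from both ends rather than assuming a fixed prefix means the same function should work whether or not the text is wrapped in a function call.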

Re: Quick question and a request for help on "chart-scraping"

Post by D1984 » Tue Jun 02, 2020 7:28 pm

Xan,

Thank you for the very quick reply!

Just a few more questions/clarifications:

One, I'm not a computer expert or programmer or coder at all (I can surf the Internet, use OpenOffice applications, create PDFs, view videos and pictures, etc., but beyond that I am kind of lost at sea, so to speak), so I'm not sure what you mean by "pull up the Response to that request". Do you just mean copy the URL and open it in a new tab?

Here's what I've managed to do so far. The URL for AIVSX (which I used above because it goes back so far) that also includes the data for one of the funds I want (Putnam Voyager, PVOYX; it was an aggressive growth fund that started in 1969 and liquidated/merged in 2016) is as follows: http://quotes.morningstar.com/chart/fun ... A%5B%5D%7D

I am using Firefox on Windows 10. When I load the above URL, go to "Tools" on the menu bar, then "Web Developer", then "Network", and then refresh the page, I get several links similar to the one you suggested (i.e. starting with "defaultChart?type=getcc"). The one I think is the data for PVOYX is the one with "FOUSA00ER1" in it; its "Cause" is "Script" and its "Type" is "Plain" (see the screenshot at https://ibb.co/Zd8CRxK ). When I copy said link or manually paste it into another window, it turns into a URL as follows: http://mschart.morningstar.com/chartweb ... 1142165945

Which brings up a tab with the monthly data pre-1971 and then weekly (not daily) data after that...which is what I already have.

Also, when I try to paste the above link (the one that starts with "http://mschart.morningstar.com/chartweb ... type=getcc") into the "Text" part of http://jsonviewer.stack.hu/ and then go to the Viewer section, the site keeps giving me an "Invalid JSON Error" message.

Can you please help me figure out exactly where I'm going wrong? Thanks!

EDIT: I just tried all of the above with the URL http://quotes.morningstar.com/chart/fun ... arks%22%3A[%22%22%2C%22%22]%2C%22chartType%22%3A%22growth%22%2C%22startDay%22%3A%2212%2F30%2F1970%22%2C%22endDay%22%3A%2208%2F31%2F1971%22%2C%22chartWidth%22%3A955%2C%22SMA%22%3A[]}

in order to see if showing only the period after daily data started (i.e. past 1-1-1971; the above URL is for 12-30-70 to 8-31-71) would work differently, and I got the same result as above. The result goes back to PVOYX's April 1969 start date and runs all the way to its 2016 ending (instead of starting from 1-1-71), but again, it is only weekly data.


EDIT 2: OK, I think I may have got something (hopefully). If I open up the URL I mentioned above ( http://mschart.morningstar.com/chartweb ... 1142165945 ), click the very first "{" symbol (the one before the words "status" and "code"), scroll all the way down to the very bottom of the page, and SHIFT-click the very last "}" symbol (i.e. the one right before the ");") to select the whole thing, I can copy it as plain text and paste it into the "Text" section of the JSON Viewer website. When I then go to the "Viewer" section, it loads (after 15 or 20 seconds) a bunch of "breadcrumbs" or "submenus" or whatever one wishes to call them, which look like the following screenshot ( https://ibb.co/hDq6s3d ). Every one of the numbers in the submenu (from 0 at the top all the way to 17370 at the very bottom) appears to be the daily data value for one day (4-1-1969 for "0" and 10-21-2016 for "17370"), as per the following screenshot ( https://ibb.co/zWdtKsD ). Am I correct?
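
One way to sanity-check that reading of the index numbers is a small Python sketch like the one below. It assumes the series holds exactly one entry per calendar day (weekends and holidays included), which is only an inference from the two endpoints quoted above; the function name is made up for illustration.

```python
from datetime import date, timedelta

START = date(1969, 4, 1)  # the date the post gives for entry 0


def index_to_date(i: int) -> date:
    """Map a position in the series to a date, assuming one entry per calendar day."""
    return START + timedelta(days=i)


print(index_to_date(0))      # 1969-04-01
print(index_to_date(17370))  # 2016-10-21
```

Since 17370 calendar days after 4-1-1969 lands exactly on 10-21-2016, the endpoints are at least consistent with one value per calendar day rather than one per trading day.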

If I am correct (and I may very well be dead wrong), then what script or software would you personally recommend that I use to download/convert it to a plain text file or .CSV file? Again, any help with this is very much appreciated; thank you!

EDIT 3: Well, I found out a pretty easy way to convert it to CSV (I think): a nifty little program called SaveJson2CSV. What I did was download said program, download the entire file from the actual mschart URL (i.e. saved it as a plain.txt file), rename it from .txt to .json, load it into SaveJson2CSV, and click the "Execute" button. Within 10 seconds it spit out a .CSV file as pretty as you please with daily data from inception to liquidation, and with the data in rows too (all the online converters I tried put it into columns, which meant that it was too long for either Excel or OpenOffice to load fully).

Thanks again for getting me started on the right track with this.

Re: Quick question and a request for help on "chart-scraping"

Post by Xan » Tue Jun 02, 2020 8:25 pm

D1984,

In your Edit 2 you indeed found the raw information to which I was trying to direct you!

The way to parse the data out of what you've got involves writing a script or program. I don't think there's going to be a general-purpose tool that will do it out of the box. I would implement it in Perl, but it's also possible in just about any language. It would be a pretty quick thing, but it doesn't sound like that's a skill in your toolbox.

JavaScript may be the best choice here, so that you can run it with nothing other than a browser. It could be embedded in an HTML file which would prompt you to paste in the raw data from Morningstar.

I don't know when I might have a chance to play with it, though.
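
For reference, the kind of script described above could be sketched as follows. This one is in Python rather than Perl or JavaScript, purely as an illustration. The exact layout of the Morningstar JSON is not shown in this thread, so the sketch assumes only that the daily series is the longest list anywhere in the parsed data; the key names and values in the sample are made up.

```python
import csv
import io
import json


def longest_list(node):
    """Return the longest list found anywhere inside a parsed JSON value."""
    best = []
    if isinstance(node, list):
        best = node
        children = node
    elif isinstance(node, dict):
        children = node.values()
    else:
        return best  # scalars contain no lists
    for child in children:
        candidate = longest_list(child)
        if len(candidate) > len(best):
            best = candidate
    return best


def to_row(entry):
    """Turn one series entry (dict, list, or scalar) into a list of CSV cells."""
    if isinstance(entry, dict):
        return list(entry.values())
    if isinstance(entry, list):
        return entry
    return [entry]


def json_to_csv(text: str) -> str:
    """Write the longest list in the JSON text as CSV, one entry per row."""
    series = longest_list(json.loads(text))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for entry in series:
        writer.writerow(to_row(entry))
    return out.getvalue()


# Made-up sample; the real field names and values will differ.
sample = ('{"status": {"code": 200}, "series": ['
          '{"d": "1971-01-04", "v": 10.5}, {"d": "1971-01-05", "v": 10.6}]}')
print(json_to_csv(sample))
```

Writing one entry per row keeps the output tall rather than wide, which avoids the column limits in spreadsheet programs.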

Re: Quick question and a request for help on "chart-scraping"

Post by D1984 » Tue Jun 02, 2020 9:34 pm

Hi Xan,

If you get a chance in the next month or two and you want to write such a script, go ahead, but please don't feel obligated to. Like I said, you've already been quite helpful with this and I don't want to unduly burden your time. Besides, I've tried the program I mentioned in Edit 3 on two different funds and in both cases it gave me a CSV that was just what I was looking for. I crosschecked each one for accuracy by picking several random months from Morningstar and seeing whether the actual visual line graph in Morningstar matched the values/return in the .CSV for said dates/months; they did in fact match. I have to be up earlier than usual tomorrow morning so I'm about to hit the sack right now, but I'll try it out tomorrow with several more now-liquidated funds that MStar still shows data for (including two that HB mentioned as being volatile enough to be suited to the stock portion of the PP; this was before he switched to recommending index funds for the 25% stock portion).

Thanks again for your assistance with this and listening to/answering what to you probably seemed like utterly noob-level questions.
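
The spot check described above (comparing a month's return in the CSV against the chart) can be sketched in Python as below. It assumes the CSV values are growth-of-investment figures keyed by date, so that the return over a period is the ending value divided by the starting value, minus one. The dates and values shown are made up purely for illustration.

```python
def period_return(series: dict, start: str, end: str) -> float:
    """Percentage change in value between two dates in a {date: value} mapping."""
    return (series[end] / series[start] - 1.0) * 100.0


# Made-up growth values for two month-end dates (not real fund data).
growth = {"1975-03-31": 12000.0, "1975-04-30": 12600.0}
print(round(period_return(growth, "1975-03-31", "1975-04-30"), 2))  # 5.0
```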

Re: Quick question and a request for help on "chart-scraping"

Post by Xan » Tue Jun 02, 2020 9:55 pm

I hadn't seen your Edit 3 until seeing your more recent post. I'm very glad you found something that worked!