Archive for April, 2007

Instructions: How to make your mindmap

Monday, April 30th, 2007

First you need to register.

Then you need to log in, inserting your username and your password in the header.

Once you are logged in go to your personal page. You can find it by clicking on [my account], in the header.

Then you need to download your data from del.icio.us and save it on your computer.

Now you need to upload your data into Mind My Map. This can be done from your personal page, in the second form, the one that says: upload xml data. Attention: by uploading you are releasing your data into the public domain. This matters only if you care about it and consider your list of bookmarks a work of creativity.

Assuming you have uploaded your data, you now need to make the mindmap. Go to your personal page. Scroll down until you see a line that looks like:

(Data 36) [Download] [make standard mindmap] All uploaded by Pietro on 24 April 2007 [Forget] [Delete]

Of course the name of the uploader, the name of the data and the date will all be different. :) Click on [make standard mindmap].

If all has gone well so far, your map should now be waiting for you on your personal page. Click on [Show] to see it.

If you have any questions, please ask them here. And yes, I will try to simplify the process.

tagcloud for the data added

Monday, April 30th, 2007

It is still quite ugly, but it works. And at this stage the fact that something works is the only thing I care about. I am speaking about the Data page: one page for each data set, which also holds a tagcloud of all the tags in that particular xml file.

As a base I used Sujit’s code (and I wish to thank him on this occasion). It was ok, although I suspect that a real Python programmer would have rewritten everything in half the lines and at double the speed. What really bothered me was how Sujit divided the tagcloud into different sizes: he does it linearly on the tag popularity. But since tag popularity follows a power law, the result is quite steep: a handful of tags get the big sizes and almost all the others end up in the smallest one. So I rewrote that part, making every set about the same size. I am still not satisfied. I know the solution has some logs inside. But not for today. For now it is ok. See the result.
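The three ways of sizing mentioned above can be compared with a minimal sketch. This is neither Sujit’s code nor the code running on the site: the function names, the five size classes and the sample counts are all assumptions made for illustration, and the logarithmic version is only one possible reading of “some logs inside”.

```python
import math

def linear_sizes(counts, levels=5):
    """Size class proportional to raw popularity (steep under a power law)."""
    lo, hi = min(counts.values()), max(counts.values())
    span = (hi - lo) or 1
    return {tag: min(levels - 1, (c - lo) * levels // span)
            for tag, c in counts.items()}

def equal_sets_sizes(counts, levels=5):
    """Rank the tags and cut the ranking into sets of about the same size."""
    ranked = sorted(counts, key=lambda tag: (counts[tag], tag))
    return {tag: i * levels // len(ranked) for i, tag in enumerate(ranked)}

def log_sizes(counts, levels=5):
    """Size class proportional to the logarithm of the popularity."""
    lo = math.log(min(counts.values()))
    hi = math.log(max(counts.values()))
    span = (hi - lo) or 1.0
    return {tag: min(levels - 1, int((math.log(c) - lo) / span * levels))
            for tag, c in counts.items()}

# Hypothetical counts that roughly follow a power law.
counts = {"python": 100, "web": 50, "maps": 25, "mysql": 12, "blog": 6,
          "todo": 3, "misc": 2, "cat": 1, "dog": 1, "tea": 1}

for sizes in (linear_sizes, equal_sets_sizes, log_sizes):
    print(sizes.__name__, sizes(counts))
```

With these sample counts the linear version puts seven of the ten tags in the smallest class. The equal-sets version spreads them evenly but can give tags with the same count different sizes, which may be one reason to look at the logarithmic version.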

mysqldb, dreamhost

Wednesday, April 25th, 2007

So, the good news is that I found the bug. The bad news is that it cannot be easily removed. Essentially Dreamhost has installed an old and buggy version of MySQLdb. This means that all data stored in MySQL through Python gets stored as ASCII only, and any non-ASCII character raises an error. So as a temporary patch I encode everything as ASCII before sending it, and any other Unicode character just gets replaced with a ? or similar. Amen. I hate it, but for now it is ok. I might try to install a separate version of MySQLdb, but that’s in the future. For now a lot of maps which were crashing before are not crashing anymore.
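The temporary patch can be sketched in a couple of lines. This is written for today’s Python 3, while the site in 2007 would have run Python 2, and the helper name `to_ascii` is an assumption, not the name used in the real code.

```python
def to_ascii(text):
    """Return text with every non-ASCII character replaced by '?'."""
    # "replace" is the standard codec error handler that substitutes '?'
    # for each character the target encoding cannot represent.
    return text.encode("ascii", "replace").decode("ascii")

print(to_ascii("caff\u00e8 espresso"))
```

The replacement is lossy: once a character has become a ?, the original cannot be recovered from the database.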

Debug and todo list

Tuesday, April 24th, 2007

I made a todo list in the form of a mindmap. There is other information in there, not just about this website. But I limited the info about my job and personal life, and tried to make a fairly complete list of the ideas I am thinking of implementing on this website in the near future.

Today I mainly did some simple debugging. I introduced a couple of filters to make sure that the uploaded files are .xml and .mm. I know they are not going to stop the bad guys, but they might stop the mindless ones. I am starting to understand why there are so many warning signs in this society! I also started to work on map 6. I tried to make a map of the first 500 posts: no luck. 400: idem; 300: same same; 200: nope; 100: no again; 50: YES! So the first 50 posts could be mapped. I think there must be a couple of different problems with this map: first a character set problem, second a size problem. I will first try to work on the character set problem. Then I might tackle the size problem. The problem is that if a map has n tags, I need to make n dictionaries of size n. But I suspect that I might simplify the whole thing by using an n*n matrix. And if not, do we really need to map all the tags? I mean, the tags follow a power law: of those n tags probably 50 are relevant, and the rest are just the long, long tail, tags with only one entry. Maybe I could, for example, only consider the tags that have more than one post. That might actually work. And the result might even be nicer.
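The idea of dropping the long tail can be sketched as follows. The post does not say what the n dictionaries hold, so this assumes they count how often two tags appear on the same bookmark; the function names and the sample posts are hypothetical.

```python
from collections import Counter
from itertools import combinations

def frequent_tags(posts, min_posts=2):
    """Return the tags used by at least min_posts posts."""
    usage = Counter(tag for tags in posts for tag in set(tags))
    return {tag for tag, n in usage.items() if n >= min_posts}

def cooccurrence(posts, keep):
    """Count how often two kept tags appear on the same post.

    Only pairs that actually occur are stored, so the table stays far
    smaller than n dictionaries of size n when most pairs never meet.
    """
    pairs = Counter()
    for tags in posts:
        kept = sorted(set(tags) & keep)
        pairs.update(combinations(kept, 2))
    return pairs

# Hypothetical bookmarks: each post is just its list of tags.
posts = [["python", "mysql"], ["python", "web"], ["python", "mysql", "rare"]]
keep = frequent_tags(posts)
print(keep)
print(cooccurrence(posts, keep))
```

Here "web" and "rare" each have a single post, so they are dropped before any pair is counted.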

Half way: what a mess

Monday, April 23rd, 2007

I finally managed to change the code, so that now every new map that is created is first transferred to MySQL, and no big data is kept in the RAM. This was needed to allow bigger maps to be calculated, and also for future versions of the mapmaker. Still the program is less than perfect. With all the work I did, it still cannot compute data 6. Also all this use of MySQL is actually slowing down the performance for everybody else. Not clever. In the next few days I shall work on adding indexes and on other changes that should make the code faster and MySQL less burdened by the weight of the mapmaker. I also need to look very carefully into the whole topic of the various character sets. Quite often the map is not generated because the data are in the wrong character set.

I also noticed that, unless you were an administrator, it was not possible to make a map from data that had not been uploaded by you. This was against the spirit of the site, so I corrected it, and now everybody can use everybody else’s data.

few bugs cleared, and discussion on the future work

Tuesday, April 17th, 2007

I will be out for a few days. In the meantime, I removed a few more bugs (do they actually proliferate?). In particular there was a small nasty one: if you were trying to memorise a map, it would make you memorise the data with the same id, and vice versa. This might have driven a few people away, who knows.

In the meantime we got a new user: Thomaspoto, who generously uploaded his xml data but then did not go on to actually make the map. The previous user followed the same pattern too, so now I am starting to be paranoid that the process is not clear enough. Maybe I should spell out somewhere that after you upload an xml file you still need to click on the [Make Standard Map] button.
In any case I went in again and clicked the button myself. It is also strange that no one seems to make anybody else’s mindmap. I would be so curious.

But besides all this I am now working on why I can’t get mindmaps made for data that is too big. And too big is really not that big at all: 500 bookmarks or so. After some testing I reached the conclusion that I am not hitting the CPU limit and I am not hitting the process time limit (I got a process running for 20 seconds before I killed it myself), so I must be hitting the RAM limit. This is a hard one. To work around it, and make the calculation without being too heavy on the RAM, I am now studying how to store a whole xml file in MySQL and then interact very closely with MySQL, without ever holding so much data in the RAM all at once. Hopefully, eventually, it might work.

In the meantime I am off; treat the lady (the website) well.
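One way to move an xml file into a database without loading it whole is to parse it as a stream and insert the rows in small batches. The sketch below is an illustration under several assumptions: it uses the standard-library sqlite3 module as a stand-in for MySQL so that it runs anywhere, it assumes the export is a `posts` element containing `post` elements with `href` and `tag` attributes, and the table and function names are invented.

```python
import io
import sqlite3
import xml.etree.ElementTree as ET

def iter_posts(source):
    """Yield (href, tags) one bookmark at a time from an XML export."""
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)            # the enclosing <posts> element
    for event, elem in context:
        if event == "end" and elem.tag == "post":
            yield elem.get("href"), elem.get("tag", "")
            root.clear()               # drop posts already handled

def store_posts(conn, source, batch_size=100):
    """Insert the bookmarks in small batches; return how many were stored."""
    conn.execute("CREATE TABLE IF NOT EXISTS post (href TEXT, tags TEXT)")
    batch, total = [], 0
    for row in iter_posts(source):
        batch.append(row)
        if len(batch) >= batch_size:
            conn.executemany("INSERT INTO post VALUES (?, ?)", batch)
            total += len(batch)
            batch = []
    if batch:                          # whatever is left after the loop
        conn.executemany("INSERT INTO post VALUES (?, ?)", batch)
        total += len(batch)
    conn.commit()
    return total

# A tiny hypothetical export, small enough to read at a glance.
sample = io.BytesIO(
    b'<posts user="someone">'
    b'<post href="http://example.com/a" tag="python mysql"/>'
    b'<post href="http://example.com/b" tag="maps"/>'
    b'</posts>'
)
conn = sqlite3.connect(":memory:")
stored = store_posts(conn, sample, batch_size=1)
print(stored)
```

At any moment only one batch of rows is held in memory. With MySQLdb the placeholders would be `%s` rather than `?`, and the later calculations would have to be written as queries against the table instead of loops over Python dictionaries.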

Pietro

mysqldb installed with easy install… fuck off!

Sunday, April 15th, 2007

I am going further, and to test some .py code on my machine I had to add mysqldb. Then, on opening it, he refused to work unless ‘easy install’ was installed too. They must have been friends. He tried to get his mate by himself and didn’t succeed. So there I was, searching the net to get the second package. That was a .egg file. Now .egg files are in general archives. Not this one. I actually had to call “sh thefileIjustdownloaded”, something that I happened to discover on some obscure page. Now, with easy install installed, I could go one step further and install mysqldb. The next step was discovering that he was not willing to let me install mysqldb unless I had a version of mysql running locally. How patient am I! In any case I wanted to have a local mysql, so there I was looking for the right version. First locate mysql.com, then proceed past the terrorist notice: “unless you are a god in programming buy a local guide”. Now what kind of Mac have I got? I know it is not Intel, so it is PowerPC. 32 or 64? Google doesn’t help. Wikipedia neither, and the About this Mac is similarly obscure. I just know that my cpu is a PowerPC G4 (1.1). No reference to 32-bit or 64-bit. It is probably a question too stupid to ask? Let’s try. I download the 64. Install. Run from the System Preferences. It does not start. But he did not protest when I installed it, it MUST be the right type… right? I actually need to add a few things to the PATH. Look, he does not even find mysql. But here they are. Let’s try to run it, you know, just for sport.

>WRONG CPU TYPE.

Hmm, looks like I have a 32 bit. Let’s download the 32 bit. Read Me: unfortunately installing a new version of mysql does not uninstall the previous one… Great! So google: how do I uninstall a mysql that was wrongly installed (although-I-can’t-understand-why-I-had-to-install-it-in-the-first-place-since-I-only-wanted-to-connect-to-the-far-away-server). Answer. Worked fine. Except that in the meantime I had already installed half of the 32 bit version. Let’s delete all and start again. Ok, installed. Now let’s go back to mysqldb.

Same problem, he cannot find “mysql_config”. I use Spotlight, I can’t find it either. I use Quicksilver (they promised me that it was the best), … not with this either. I use google. Only one person had the same problem (and his post is mirrored all over the place), but the suggestions don’t seem to apply.

Someone suggests: you need to tell the setup where to find mysql_config. Well, how can I tell him if I can’t find it myself? It continues: normally it is in …/mysql/bin/

“normally” works fine for me. So I found it, how do I tell him? The error was raised by setup_posix.py. Here you go:

f = popen("%s --%s" % (mysql_config.path, what))
and nowhere is mysql_config.path defined. I changed it into:

f = popen("%s --%s" % ('/usr/local/mysql/bin/mysql_config', what))

and it works!
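Hard-coding the path works on one machine only. A slightly more portable lookup can be sketched like this; it is not part of MySQLdb, the function name is invented, and it relies on `shutil.which`, which exists in today’s Python 3 but did not exist in 2007.

```python
import os
import shutil

def find_mysql_config(extra_dirs=("/usr/local/mysql/bin",)):
    """Return the path of mysql_config, or None if it cannot be found."""
    # First try the directories already listed in PATH.
    found = shutil.which("mysql_config")
    if found:
        return found
    # Then fall back to the places where it "normally" lives.
    for directory in extra_dirs:
        candidate = os.path.join(directory, "mysql_config")
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None

print(find_mysql_config())
```

Adding …/mysql/bin to the PATH before running the setup would achieve the same result without touching setup_posix.py at all.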

python setup.py install
sudo python setup.py install

A long list of things scrolls by, and it ends with something similar to:

mysqldb installed with easy install.

“…with easy install”: fuck off!

There must be a better way to spend a Sunday. In any case, now that mysqldb is installed I just discovered I cannot run the code because he cannot connect to the mysql server. It must be because of the university proxy. It doesn’t matter, I shall test it directly on the net.

Why the data should be released in the Public Domain

Saturday, April 14th, 2007

After discussing with a friend of mine I was convinced to require everybody who uploads any data to MMM to also release it into the Public Domain. There are a few reasons:

a) In any case any data that is uploaded to Mind My Map can be easily downloaded. Also the xml data can be regenerated from the mindmap. So I really don’t want to give the user an impression of security that is not there.

b) I want to use the xml data to calculate the distance between mindmaps, as I have described in my English blog. I described those algorithms a year and a half ago and no one has stepped forward to use them. So I will.

c) If the data is in the Public Domain, then anybody can use anybody else’s data. And the site becomes a big toy store. Yeah! Anybody can make maps from anybody else’s xml data. For now only standard maps are available, but soon…

So I had to work to change the data structure. Now each map and each data set stores its creator. And then you have a list of the data that you have decided to remember. Creating some data automatically makes you remember it. And the user page now only shows the data you have chosen to remember.

This brings up a few problems:

when should the data be deleted, if ever?

should you ‘remember’ any new map that has been generated from data you have created or remember?

Hmm…

To avoid mistakes, for now no data gets deleted (I took away the buttons). But eventually I will need to reorganise this.

And we also have the first user who uploaded uncorrupted data. Welcome to janzo. Using his data I made a really pretty map.
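The creator and remember scheme described above can be sketched with two small classes. This only illustrates the idea and is not the site’s real schema (which lives in MySQL); the class, field and method names are all assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class Data:
    id: int
    creator: str            # every data set stores who uploaded it

@dataclass
class User:
    name: str
    remembered: set = field(default_factory=set)   # ids of remembered data

    def upload(self, data_id):
        """Creating data automatically makes the creator remember it."""
        data = Data(id=data_id, creator=self.name)
        self.remembered.add(data.id)
        return data

    def remember(self, data):
        self.remembered.add(data.id)

    def forget(self, data):
        """Forgetting only hides the data; nothing is deleted."""
        self.remembered.discard(data.id)

    def page(self, all_data):
        """The user page lists only the data the user chose to remember."""
        return [d for d in all_data if d.id in self.remembered]

pietro, janzo = User("Pietro"), User("janzo")
d36 = pietro.upload(36)
d37 = janzo.upload(37)
pietro.remember(d37)        # anybody can remember anybody else's data
print(pietro.page([d36, d37]))
```

Because forgetting never deletes anything, the open question of when data should be deleted stays separate from what each user sees on the personal page.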

From Python to MySQL without passing through php

Thursday, April 12th, 2007

I made some changes in the code. Now the map maker connects directly to Python, and stores the data directly in MySQL. I am not sure it makes such a huge difference in terms of speed, but it surely does in the complexity of the code and of the structure. Before, I would read the data in PHP, store it as a file, call the .py program, write the result file from Python, read the file from PHP and store it in MySQL. A mess. Now it is all done through Python. This might also open the door to the idea of running the program in the background.
At the moment the speed of MySQL is the bottleneck of the whole service. Only people with few delicious bookmarks can have their map done. I don’t like this, and I have a few ideas in mind to solve this situation. The first one will just be to try to make the calculation run in the background, so the user is not forced to keep a window open the whole time. The process might still be killed by DreamHost, though. So it is only moving the bar, not really solving the situation.

Some data deleted, and some code tested.

Wednesday, April 11th, 2007

Some of the information uploaded by users has been corrupted. Without realising how small a BLOB is, I saved the uploads in one, only to discover that just the first 64 KB were stored. Please try to upload your data again, and let me know if it now works better.
I also fixed a few bugs and spent some time writing a system that connects directly to delicious. Eventually I want it to connect not only to delicious but to every other bookmarking service that offers the whole set of bookmarks through its API, but for now I shall start with delicious.
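The 64 KB ceiling comes from the MySQL BLOB type, which holds at most 65,535 bytes. A check like the following, run before saving, would have caught the problem; the limits are the documented ones for the four MySQL BLOB types, while the function names are invented for this sketch.

```python
# Maximum payload, in bytes, of the MySQL binary column types.
BLOB_LIMITS = [
    ("TINYBLOB", 2**8 - 1),
    ("BLOB", 2**16 - 1),
    ("MEDIUMBLOB", 2**24 - 1),
    ("LONGBLOB", 2**32 - 1),
]

def smallest_blob_type(size_in_bytes):
    """Return the smallest column type that holds size_in_bytes untruncated."""
    for name, limit in BLOB_LIMITS:
        if size_in_bytes <= limit:
            return name
    raise ValueError("too large for any MySQL BLOB type")

def fits(data, column_type="BLOB"):
    """True if data (bytes) can be stored in column_type without truncation."""
    return len(data) <= dict(BLOB_LIMITS)[column_type]

upload = b"x" * 70000       # a hypothetical 70 KB xml upload
print(fits(upload), smallest_blob_type(len(upload)))
```

A 70 KB upload does not fit in a BLOB but fits easily in a MEDIUMBLOB, which holds up to about 16 MB.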

A couple of users have tried the site, enough to register. Too bad the website was not working at the time.