certutil and cert8.db in Firefox

Using certutil to print cert8.db

This is a small post explaining how to use certutil with cert8.db. It is especially useful if you run into the error: certutil: function failed: SEC_ERROR_LEGACY_DATABASE: The certificate/key database is in an old, unsupported format.

  1. What is the cert8.db?
    1. cert8.db is the certificate store for Firefox. Earlier versions used cert7.db; the latest versions of Firefox store the root certificates (and other certificates) in cert8.db
  2. Why does this matter?
    1. This file is similar to the Windows certificate store, which holds the SSL certificates for a Windows machine. As Firefox is cross-platform, it stores its certificates in its own file (much like Java does with jks)
  3. Why do I care about the cert8.db?
    1. You can query this file to get the list of certificates that are part of Firefox
  4. How do I install this?
    1. On Ubuntu machines, you can run sudo apt-get install libnss3-tools
    2. On Windows machines, you can download the certutil.exe from here
    3. For Windows, you can also check this SUMO link
  5. So, how do I query the cert8.db?
    1. Copy the cert8.db from your Firefox profile into some directory. Your Firefox profile is in ~/.mozilla/firefox/<randomstring>.<profilename> on Linux (and typically %APPDATA%\Mozilla\Firefox\Profiles on Windows, though you can change the location). Say you copied the file into ~/code/tmp
    2. Then you open a terminal window and cd to ~/code
    3. Now type certutil -L -d tmp
    4. This will list all the certificates in the cert8.db that is in tmp directory
  6. So, as you may have noticed, you don’t query the cert8.db file itself; rather, you query the directory that the cert8.db file is in
  7. The above command will list all the root certificates within cert8.db (a scripted version of this step is sketched after this list)
  8. If you want to print the complete certificate chain of any one certificate, say DigiCert High Assurance EV CA-1
    1. certutil -L -n "DigiCert High Assurance EV CA-1" -d tmp
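
As a scripted version of the listing step above, here is a minimal sketch (not from the original post; the directory name and the use of Python are my own) that shells out to certutil:

import subprocess

# List the certificates in the copied cert8.db; "tmp" is the directory that
# holds the copied file, exactly as in the steps above.
profile_dir = "tmp"
result = subprocess.run(
    ["certutil", "-L", "-d", profile_dir],
    capture_output=True, text=True, check=True,
)
print(result.stdout)  # one line per certificate nickname plus its trust flags

Note: newer certutil builds default to the SQLite-backed cert9.db format; if yours still supports the legacy format, you can usually point it at cert8.db explicitly with -d dbm:tmp, which is one way around the SEC_ERROR_LEGACY_DATABASE message.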

Hope this is useful for you in checking out the usage of certutil

Mozilla projects list – V1

List of all Mozilla projects, documented from wiki.mozilla.org

I was checking the Mozilla Wiki to see if there is a directory of all Mozilla projects and the stages that they are in. While the homepage of the Wiki has the most important projects, there were some good projects tucked away inside the pages of their parent projects. While this is good in that there is less clutter, it makes it difficult to market certain projects to get contributors – to develop / QA / support / write good documentation for them.

So, I thought, why not create this tree. My focus was specifically on the various products and projects that Mozilla works on. I didn’t cover the marketing, sales, partnership, or legal teams that Mozilla has. This is not to say that they will not be included in the next version. I would appreciate feedback on this chart, and based on that I will create the V2 with more details.

And coming to the chart itself, there are a couple of notes

  1. I am a newbie web-developer (not a newbie developer though !). So, if the HTML is not right, then please let me know and I shall learn the right way and fix it
  2. The code is a shameless copy of Mike Bostock’s collapsible tree example using D3
    1. If there are any license violations, please let me know and I shall remove the code
  3. I found out about this from James Westgate’s reply on SO
  4. The changes I did are
    1. Created the JSON data for the Mozilla projects. You can get the JSON file here
    2. Figured out (again thanks Mike) that I need to create an iframe that will include the HTML for the visualization as part of the HTML page
    3. The iframe’s background has been tweaked because the width of the complete visualization was larger than the width of the container of the blog post – that is why you will see the iframe at 90% opacity
    4. Needless to say, you will need to enable JavaScript on the page and allow the D3.js site

Please let me know if I have made any mistakes or if you have suggestions on how this can be made better. A couple of enhancements I am thinking of for the next version:

  1. Having a bit more meta-data in the nodes – possibly the wiki link
  2. Include active/inactive projects
  3. Provide weak-links amongst projects so that one can visualize the various links amongst the projects
  4. Learn D3 🙂

Development environment setup for Lisp

I had been reading about Lisp as a development language for a very long time (Paul Graham’s writing definitely played a factor, along with a couple of other things). I even did start on Land of Lisp but could never finish it, and Lisp fell off the radar.

I wanted to get back to this (more later) and had to set up a dev environment that would help me learn more. Even though clisp was an option, I didn’t want to rely on gedit and clisp on the command line. That didn’t feel right.

So, here is the account of how I went about setting up Lisp on a Linux Mint (Ubuntu would do just fine). If you are looking for instructions for Windows, most of these might work – YMMV.

  1. Based on this discussion on SO, I decided to use ClozureCL. Even though SBCL and ClozureCL both seem to be able to generate native binaries, I thought that native support for threads is something that will be helpful in the longer run. Ideally though, it would have been nicer if there were no differences between the various Lisp flavors
  2. The next step was to download CCL on the Linux machine. Again, I preferred setting up everything on a Linux machine, just to spare myself the pain of getting things working. The main idea is to write the Lisp code, not worry about the setup (though the latter took a lot of time too)
  3. Download the CCL from Clozure FTP  – got the 64bit version, since I was running a Linux Mint Nadia distro on a laptop and Linux Mint Maya on a VM.
  4. The installation required unzipping the file, and that was very simple. Another change I made was to the $CCLDIR/scripts/ccl and $CCLDIR/scripts/ccl64 files, setting CCL_DEFAULT_DIRECTORY to the installation directory
  5. Now this was the CCL setup, and the next thing was – well, setting up the IDE. After searching for a while, I realized that the most recommended IDE seemed to be emacs. I have been trying to get my head around emacs (I have been an intermediate vim user for a while, though I haven’t done much with it lately). So I thought, well, maybe this is a good time for me to start on emacs
  6. emacs was already installed on my machine, so I went through the built-in tutorial. Again, based on comments, this seemed to be the best way of learning emacs. After crossing around 35% of the tutorial, I felt fairly ok to *start* using emacs. Mind you, at this stage, I knew how to navigate, a bit of yank and kill, and some keys that I remembered. Nothing fancy
  7. Then I tried to ‘execute’ the Lisp code. The first function was rather simple: (defun myfn() (+ 2 3)). When I tried to do this from stock emacs, I couldn’t figure out how it could be done. The standard M-x run-lisp didn’t work straight out of the box since I was not using clisp and my Lisp was not installed in the default location. Modifying the ~/.emacs file didn’t help either – setting the inferior-lisp-program variable alone didn’t fix it; ccl wouldn’t execute, or I would see swank warnings that I didn’t understand.
  8. At this stage, I was very frustrated and thought that SLIME might be the only way out. Checking the SLIME documentation made me think that SLIME might be the one I needed.
  9. So, the next thing was installing SLIME and configuring it for CCL. Clozure’s trac is the location for the instructions.
  10. First download the beta version of Quicklisp. Quicklisp is the apt-get / Maven / PyPI / CPAN of Lisp. So, the first thing to do is download the quicklisp.lisp file: curl -O http://beta.quicklisp.org/quicklisp.lisp
  11. Then load quicklisp.lisp into CCL using ccl64 -l quicklisp.lisp
  12. Then call the install method (quicklisp-quickstart:install)
  13. This will download the required files and create a ~/quicklisp directory with the required modules
  14. Then run (ql:quickload :quicklisp-slime-helper) in the same CCL session. This will install the SLIME modules for CCL
  15. Then the ~/.emacs file has to be changed as mentioned in http://trac.clozure.com/ccl/wiki/InstallingSlime. I provided the location of the ccl64 script file for the inferior-lisp-program variable. Also, instead of providing the location of the SLIME installation, provide the location of slime-helper.el:
    ;;(add-to-list 'load-path "~/code/slime/") ; or wherever you put it
    (load (expand-file-name "~/quicklisp/slime-helper.el"))
    (setq inferior-lisp-program "~/code/tools/ccl/scripts/ccl64 -K utf-8")
    More information on configuring SLIME is on SO
  16. Then the next step is to load SLIME. Before that, open the file that you want to compile. C-x C-f ~/code/test1.lisp
  17. Then load SLIME: M-x slime. If all is well, you should be greeted with a CL-USER> REPL prompt. You can test that CCL is working with a simple function: defining (defun myfn() (+ 2 3)) and calling it with (myfn) should do the trick.
  18.  Then the next thing to do is to be able to ‘compile’ the existing files loaded in the editor. On the file buffer, you can compile the file by typing C-c C-k. If the file doesn’t have any errors, you will see a message ‘compilation finished’
  19. The nice bit of this is that SLIME picks up the latest definitions if the file has been modified. So, you can switch to the SLIME buffer (C-x b switches to the last used buffer) and call the function to see whether the new changes are picked up.
  20. And btw, to exit the Lisp REPL, the function to use is (quit)

Hopefully the above instructions help you set up emacs, Clozure CL, and SLIME, and get you started on writing some Lisp code. I am still learning emacs, and a tip here – start with the built-in tutorial. It is a bit long, but it will be worth the time. I will publish my cheat-sheet for the emacs keys some time in the future.

Restarting the writing?

Am writing after a very long break – a break in which finding the mental energy to write a blog post was tough to come by. A lot of things have happened in the personal life – something I generally avoid writing on the blog. Suffice to say that I am amazed at what I pulled off. All thanks to my family.

The last year also saw me travel a bit more than usual – something I love to do. Nothing like slinging a backpack, taking a map and walking around (walking – yes).

My geographic location has changed, my work location has changed, what I do at work has changed, my relationship status has changed, the number of wheels I own has changed, my waist-line has changed. Quite a few changes ! A couple of things have remained constant though – am still not very good at staying in touch, am still a big fan of filter coffee, still have a couple of close friends (go to point 1 please), still read books (err..what?), still trying to find ways to ‘switch-off’ from work.

In the last year, I did read quite a few books and I am not going to write my usual reviews of them. Just a list of them and what I think about them

  1. A mathematician plays the stock market – 6/10. Just because he explains puts/shorts nicely. Otherwise, mostly rambling about WorldCom
  2. Making the world work better – corporate marketing at its best. A 3/10. Didn’t purchase this book though, a freebie kindly given to me by IBMers at Gartner 2011
  3. The man who knew infinity – 6/10. A biography of Ramanujan – and not much maths in it. Yes, it is indeed possible !
  4. Who says elephants can’t dance – 7/10. Clear, concise, frank narrative – especially relevant if you are trying to revamp a large organization

And what about what is happening now

  1. I work with an amazingly driven team – and am working very hard to ensure that we hire the right kind of people – which is not an easy task, BTW.
  2. Am learning more about how to build a great team – am making mistakes (hopefully not bad ones) and learning.

    Took quite a fall, didn’t we, Master Bruce. And why do we fall, sir? So that we might learn to pick ourselves up

Well,  I think this is enough for a restart post. For the couple of people who do read what I write – thank you.

 

Calculate powerset of a set

This post is an attempt to understand how the calculation of a power set is executed for a given input list. Note that I said list, although technically the power set is a set calculation. The idea comes from this link. The Haskell code tries to mirror the algorithm given in the Wiki.

The basic idea of the algorithm is this – take one element from the set, compute the power set of the remaining elements, and combine (union, in mathematical terms) that power set with a copy of it in which the chosen element has been added to every subset. To calculate the power set of the remaining elements, continue the same process recursively. The power set of an empty list [] is a list containing just the empty list, [[]], which is the terminating condition for the recursion. The Haskell code on the site is simple looking, but packs a good punch. map in Haskell is like map in any other language – it applies the given function to every element of a list. In the code below, (x:) is a function that prepends x to the front of a list. And the ++ operator concatenates two lists

powerset :: [a] -> [[a]]
powerset [] = [[]]
powerset (x:xs) = powerset xs ++ map (x:) (powerset xs)

Let us walk through the execution of the recursive algorithm. For the sake of brevity, I am going to name the function ps

ps [1,2,3] = ps [2,3] ++ map (1:) (ps [2,3])
                     ^
                     |--- (1)
ps [2,3] = ps [3] ++ map (2:) (ps [3])
             ^
             |-- (2)
ps [3] = ps [] ++ map (3:) (ps [])
           ^
           |-- (3)
= [[]] ++ map (3:) (ps [])
                     ^
                     |--(4)
= [[]] ++ map (3:) ([[]])
            ^
            |--- (5)
= [[]] ++ [[3]]
= [[],[3]]
  --- so, the result of (2) is ps [3] = [[],[3]] -- (Result1)

Now we come back to the ps[2,3]. So,
ps [2,3] = [[],[3]] ++ map (2:) (ps [3])
=> ps [2,3] = [[],[3]] ++ map (2:) [[],[3]] -- from Result1
=> ps [2,3] = [[],[3]] ++ [[2],[2,3]]
=> ps [2,3] = [[],[3],[2],[2,3]] -- (Result2)

Now we come back to ps [1,2,3]
As noted above,
ps [1,2,3] = ps [2,3] ++ map (1:) (ps [2,3])
=> ps [1,2,3] = [[],[3],[2],[2,3]] ++ map (1:) (ps [2,3])
     -- from (Result2) we know ps[2,3]
=> ps[1,2,3]=[[],[3],[2],[2,3]] ++ map (1:) [[],[3],[2],[2,3]]
=> ps[1,2,3]=[[],[3],[2],[2,3]] ++ [[1],[1,3],[1,2],[1,2,3]]
=> ps[1,2,3]=[[],[3],[2],[2,3],[1],[1,3],[1,2],[1,2,3]]

I have reused the results mentioned in Result1 and Result2, but during execution those results are generally re-computed: powerset xs appears twice on the right-hand side, and Haskell does not guarantee that the two identical calls are shared. Binding the recursive result once, e.g. let p = powerset xs in p ++ map (x:) p, makes the sharing explicit.
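
For readers more comfortable with Python, here is a small sketch of the same recursion (my own translation, not from the original post); it binds the recursive result once, which is the explicit-sharing variant mentioned above:

def powerset(xs):
    # Power set of a list, mirroring the Haskell definition above:
    # the power set of [] is [[]]; for (x:xs) it is powerset(xs)
    # followed by x prepended to every subset in powerset(xs).
    if not xs:
        return [[]]
    x, rest = xs[0], xs[1:]
    p = powerset(rest)  # compute the recursive result once and reuse it
    return p + [[x] + s for s in p]

print(powerset([1, 2, 3]))
# [[], [3], [2], [2, 3], [1], [1, 3], [1, 2], [1, 2, 3]]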

How to lie with Statistics – a good cheat-sheet to cheat with numbers

The moment someone mentions statistics, the most common reaction is a big yawn or a sigh of disbelief. This is because most people have heard such statistics in some ad or marketing journal, or maybe from a not-so-trustworthy source. And that reaction is justified 86% of the time ;). This sense of disbelief / wonder (and it works both ways) mostly comes because the reader can’t vouch for the veracity of the numbers. The assumption always is that the person talking about the numbers (or the statistic, technically speaking) knows what (s)he is talking about. And even if that is not the case, it is very easy to get lost amongst the numbers – when there are averages, percentages, YoY (year-on-year) growth, percentage points, APRs and myriad other measures used to make the person talking about them (who more often than not is selling something) look either very good or very bad. And it looks like all these people have the same field manual – How to Lie with Statistics. And this doesn’t seem to have changed since 1954, when the book was first published!

OK, before the book gets any more negative connotations – that is not the purpose of the book. The book is meant to help the reader see through these marketing ploys – to be able to ask the right questions and to know when to dismiss a statistic as faulty. It is a field manual to beat the cheaters at their own game. And this delightful book comes in a small package – just 150 pages – and is an easy read. The book has 10 chapters, each with a specific theme:

  1. The sample with a built-in bias : the origin of statistics problems – the sample. Any statistic is based on some sample (because the whole population can’t be tested) and every sample has some sort of bias, even if the person commissioning the statistic tries hard not to create any. The built-in bias comes from respondents not replying honestly, the market researcher picking a sample that gives better numbers, personal biases based on the respondent’s perception of the market researcher, and data not being available at a certain time in the past – a few of the biases that creep in when building a statistic. One of the examples (from the 1950s) that the author mentions is a readership survey of two magazines. Respondents were asked which magazine they read the most – Harpers or True Love Story. Most respondents said they read Harpers, but the publishers’ circulation figures showed that True Love Story had a much higher circulation than Harpers – contradicting the results from the sampling. The reason for this discrepancy – people were not willing to respond honestly due to their own biases. As Dr. House says – everybody lies! Summary of the chapter – given any statistic, question the sample that was taken. Assume that there is always a bias in the sample
  2. The well-chosen average: how not qualifying an average can change the meaning of the data. Before I delve into this, quickly – when I say average, what comes to your mind? Sum(x1…xn) / N, right? The arithmetic mean. But I said average, not arithmetic average, did I? Not many people know that there are 3 averages
    • Arithmetic average / mean – sum of quantities / number of quantities
    • Median – the middle value when the data is sorted
    • Mode – the data point that occurs the most in a given set of data

    And when someone says average, leaving it unqualified, there is a lot of room for juggling. The author mentions a very simple example. If an organization publishes a statistic that the average pay of its employees is $1000, what does this mean? It makes most of us think that almost everyone makes around $1000 – the reader assumes it is the median. But the corporation could be talking about an arithmetic mean, where the boss earns, say, $10,500 and the rest of the 19 employees earn $500 each. Just by not qualifying the average, the published fact can be completely twisted out of shape from the real facts. The way out – always ask what kind of average someone is talking about (a tiny numeric illustration is sketched after this list).

  3. The little figures that are not there: This chapter is about how sample data is picked in a way that proves the desired results – something we are all too aware of in marketing campaigns. Picking the sample data ‘right’ can mean picking a sample size that gives the kind of results you are looking for, or a smaller number of trials. The author demonstrates this with a very important issue for parents – is my kid normal or not. The author talks about the ‘Gesell Norms’, where Dr. Arnold Gesell stated that most kids sit erect by the age of two. This immediately translates to a parent thinking about his/her kid and deciding whether the kid is normal or not. What is missing in this case is that, from the source of the information (the research) to the Sunday paper where a parent read this, the average has been changed from a range to an exact figure. If the writer of the Sunday magazine article had mentioned that there is a range of ages in which a child learns to sit erect, the reader would be assuaged – and that is where the little figures disappear. The way out – ask if the information presented is a single figure or if there is a range involved.
  4. Much ado about practically nothing: This little chapter is about errors in measurement. There are two measures of error – the probable error and the standard error. The probable error reflects how far off a measurement can be, given the limitations of the measuring device. For example, if you were using a measuring scale that can be off by up to 3 inches, then any measurement you take is really that reading +/- 3 inches. This kind of difference becomes important when business decisions are taken based on a positive or negative result.
  5. The Gee-Whiz graph: This one is something that we see quite often – how to manipulate a graph so that it shows an inflated / deflated picture (based on what you are plotting). Some tricks include leaving out the scale of an axis, or not labelling the axis and leaving only numbers, letting the reader make his/her own assumptions.
  6. The one-dimensional picture: This one is an interesting trick. The trick is to use some sort of symbol – a money bag, a factory and so on – on the graph. So, when showing the growth of, say, a factory, increase the size of the factory image – and increase it across all dimensions. An example – you try to display a difference in pay scale. If it were a bar chart, you’d have one bar with a measure of (say) 10 and another of (say) 30, so the 1:3 ratio is clear when you see the chart. Now picture money bags of similar proportions – one of size 1 and the other a much larger one – and immediately the reader perceives an increase of 1:9. Why? The money bag has been grown 3 times across both dimensions, so its area grows 9 times. Given that the reader is seeing this on a graph, the extra dimension is forgotten and the large money bag gives the impression of a much larger difference than 1:3 !
  7. The semi-attached figure: This one is a Sir Humphrey Appleby classic (the one used in A Real Partnership). And what is the idea? Very simple. If you can’t prove what you want to prove, prove something else and demonstrate that the two are the same! Things like: you can’t prove that your drug cures colds, but you can prove that it kills 32,132 cold germs, more than any other drug. And then wait and watch for the consumer to assume that there is a connection. The author also talks about percentages in this chapter, where growth can be measured in percentages or percentage points. And when it comes to percentages – you have the classic trap. A growth of 10% means what? Measured against last year’s production, or some arbitrary year that you decide to pick? And if it was some other year in the past, how is the reader to make the mental leap to know whether the 10% growth is real growth – very difficult indeed. And then there is the percentage point – a drop of 3 percentage points sounds like a much softer blow than saying a loss of 23% (say, a drop from 13% to 10% in real terms).
  8. The post hoc rides again: I guess this is my favorite chapter of the book. Post-hoc analysis – the cause and effect problem. You have the effect and then you go shopping for the cause that you want to portray. And what is to stop someone from making that connection? The author brings up an example of smoking being related to bad grades. In this case, was smoking the cause of the bad grades, or did the individuals who were getting bad grades decide to take up smoking? How is one to contest such an assumption? The way out, mostly, is to ask when the data was collected and whether enough data was available during the entire course of the experiment (there is also the possibility of someone deducing things about a period for which no data is available – how do you contest something when there is no data available at all !).
  9. How to statisticulate: In a term the author coins, to statisticulate is to manipulate readers using statistics. This chapter is a cheat sheet of what to watch out for. The author lists various tricks – measuring profit on cost price, showing a graph with a finer Y-axis scale just to make growth look steep, and misleading income calculations that count children in the family as individuals when computing the average, among a few others.
  10. How to talk back to a statistic: This one is a cheat sheet for the reader – what one should ask to find out whether the statistic being presented is genuine or not. There are a few questions one can ask:
  • Who says so – who is publishing the result? Do they have anything to gain from it?
  • How does he know – how did they measure this result? Was there any sampling bias?
  • Did someone change the subject – Are the cause and effect getting muddled?
  • Does it make sense – ’nuff said 🙂
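
To make the well-chosen-average point from chapter 2 concrete, here is a tiny sketch (my own illustration, using the hypothetical payroll mentioned above):

from statistics import mean, median, mode

# Hypothetical payroll from the "well-chosen average" example:
# one boss at $10,500 and 19 employees at $500 each.
salaries = [10500] + [500] * 19

print(mean(salaries))    # 1000 -> the "average pay is $1000" claim
print(median(salaries))  # 500  -> what the typical employee actually makes
print(mode(salaries))    # 500  -> the most common salary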

And that summarizes the book. This book is a must-read if you are interested in numbers and their interpretation. And the tricks mentioned in this old book are in use even now. So, the date of publishing and the relevance have a negative correlation ;). Definitely a recommended read. An 8/10 – it loses out because some of the topics were not given a thorough treatment.

The Silent World – an ok read

This book – The Silent World by Capt. Jacques Cousteau – sort of reached me by chance. When I checked the title, I saw that this was a National Geographic Adventure classic – not the sort of book I would have picked up myself (or would have known about). And this is probably one of the few books I didn’t check any reviews of before I started reading it – and I am not disappointed at all. For someone who didn’t know anything about scuba diving (other than seeing it on TV), this book has been something of a revelation. This book is not about the science behind scuba diving, though one does pick up a few facts here and there. What this book is about is the journey of Captain Jacques-Yves Cousteau as he discovered scuba diving. Well, not really scuba diving but the aqua-lung. Another thing I learnt after I finished reading the book was that it was first published in 1953; which makes sense when I think about it now – the final chapter felt a bit abrupt.

So, what is this book about? The book describes the captain’s experiences as he discovers the undersea world – the world that we now conveniently get to see on TV. He describes underwater life and the richness of the experience one feels during scuba diving. There are also details of some of his expeditions for treasure hunts and shark trips. The treasure hunts, contrary to my expectations, were not very interesting reads. The shark close-ups chapter was much better. The best part of the book for me was the detail the captain goes into for undersea cave diving. I couldn’t always visualize some of the challenges faced by scuba divers that the captain describes, and that is why this book falls short of my expectations. As a pioneer in the field, I expected the captain’s narrative to be a lot more exciting and vivid. Maybe the adrenaline rush that I was looking for was lost in translation.

The best part of the book, as I mentioned, was the description of undersea life during the captain’s cave diving expeditions in the Mediterranean. That did make me realize the joy one would have experienced seeing underwater life with such clarity for the first time. At least to be part of that experience, one should pick up this book. A 6/10.

Linked – an average read

A book about power laws and scale-free networks. The book explains how various social and network phenomena are all scale-free networks – that we are surrounded by applications of power laws. Interesting book but an insipid narrative.

With a title like Linked, one is definitely intrigued about the content. This book is interesting and uninteresting in equal proportions. What is interesting is that the author manages to make the reader look at various phenomena under the network-theory lens. What is uninteresting is that the treatment of the various areas is shallow. First of all, what is this book about – the book is about trying to understand how many facets of the world around us are linked to each other. Now, this doesn’t seem to be a great revelation; everyone knows that. What the author goes into detail about is how scale-free networks are able to describe everything from the structure of a cell to the structure of social networks. According to the author, any phenomenon which is linked to anything else in nature eventually follows some sort of a power law.

Before I go further, there are two things one needs to understand

  • Power laws
  • Scale free networks

Power law: Before understanding power laws, one needs to know what random networks are. A random network is one in which there is an equal probability of any two nodes being connected (probability speak: equally likely events). So, there is no preference for connecting any two nodes. If random networks were all around us, then the probability of me knowing the author of the book would be the same as the probability of me knowing someone I studied with. When the connectivity of such a network is plotted on a graph, it follows the very familiar bell curve. The main characteristic of the bell curve is that its tail decays very fast, i.e. there are very few data points in the tail and they keep dwindling. And that is what differs between a random network and a network following a power law. The main characteristics of a network following a power law are

  1. There is no peak for the plotted graph
  2. The tail of the distribution decays much more slowly, i.e. there is a very long tail
  3. If the network were to be scaled (say, increasing the number of connections by some constant factor), the distribution would keep its shape and change only by a constant factor (a linear change, in computational-complexity terms). This is unlike, say, exponential growth, where the growth is a function of an exponent of the constant factor (think of the old tale of the man asking the king to fill a chess board with grains which grow as a power of 2 with every square).

The power law is also called the long tail, the 80-20 rule, the small-world network, or the Pareto distribution. Essentially, there are a large number of small, common occurrences and a few rare, very large ones amongst them.

Scale free network: A scale-free network is one whose degree distribution follows a power law, i.e. the number of nodes with a given number of connections is proportional to a (negative) power of that number. More here (pdf link) and here
Here is a simple example to understand random and scale-free networks (from the book)

  • Random network: The network of the inter-state highways in the US. Every major city in the US is connected by almost the same number of highways. So, all the cities fall within the central part of the bell curve. And there are few cities which have a lot of highways connecting them or very few – the tails of the bell curve.
  • Scale free network: The network of flights and airports in the US. In the case of flights, not all airports are equal. There are some hubs in the air network which have a large number of flights in and out, and there are small satellite airports all across the country. For example, Atlanta is a big hub in the south and there are smaller airports around Atlanta which have much less flight traffic
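
To make the contrast concrete, here is a small sketch (my own illustration, not from the book): it draws samples from a bell-curve-like distribution and from a heavy-tailed Pareto (power-law) distribution; the parameters are arbitrary.

import random

random.seed(42)

# "Random network" intuition: node degrees cluster around a typical value,
# like samples drawn from a bell curve.
bell = [random.gauss(100, 10) for _ in range(10000)]

# "Scale-free" intuition: most node degrees are small, but a few are enormous,
# like samples drawn from a Pareto (power-law) distribution.
heavy = [random.paretovariate(2.0) for _ in range(10000)]

print(max(bell) / min(bell))    # small ratio - no node is wildly better connected
print(max(heavy) / min(heavy))  # huge ratio - a handful of hub-like outliers dominate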

That is it – the above is the central idea of the book. Once you get that, the rest of the book gets into the uninteresting bit! Well, not that uninteresting though. The author goes on to describe how power laws and scale-free networks are all around us – from our social network (more on this further down) to the structure of the cell; from the network economy to the links on the WWW. What the author also explains is how these large ‘hubs’ are also the failure points of the network. Bring down the hubs and the whole network collapses. He explains this with the example of the blackouts that happened in 1996 in the US. The same hubs occur everywhere – from the hubs connecting the Internet to the hubs in our social network – there always is that one person who stays in touch with everyone in the class. Remove that person from the network and the class-mates lose touch with each other (I am not sure how much this holds in the new world of social networking sites). The author goes on to explain the various properties of this scale-free network in simple language, but doesn’t go in depth into any of the topics. For example, when explaining the resilience to failure of a scale-free network, the author just mentions that the network is going to be stable even if a majority of the nodes go down, because that essentially is the feature of a scale-free network. But he doesn’t touch upon the fact that the remaining hubs will then get overloaded and may collapse under their own weight. A bit more explanation, even if mathematical, would have been welcome in the book, and that is its downside.

When I mentioned the social network above, it was interesting to me because when the book came out in 2003, there might not have been many ways to map the human network. So the author whimsically wishes that, if we could map the human network, it would throw up a lot of interesting features. And that is almost possible now. With the rise of social networking sites, it is possible to find out if indeed there are six degrees of separation between people. And that has implications in the online world. Know the hubs in every small network and you have the ability to market your idea to / interest a large set of the connected nodes. Get those nodes to write something nice about you and ‘word-of-mouth’ marketing will remove the need for a large marketing budget.

As mentioned in some of the low reviews on Amazon, the book falls short of taking any idea beyond a shallow treatment. But that doesn’t make it a bad read, just an average read. I think the author stretched the ideas a bit too thin. I’d rate it a 6/10 – just because the author does give good examples for one to get the ideas behind his premise.

Caesar cipher in one line

Can we write a Caesar-cipher encrypt and decrypt in one line each? I’m sure you can. Here is one way to do it in Python 2 (note the print statement). This assumes the text is typed in English (or at least text on which the lower() function can work).
Encrypt
import sys;print ''.join([chr(x) for x in [x if x else x+ord('a') for x in [ (ord(x.lower())+1)%(ord('z')+1) for x in sys.argv[1]]]])
Decrypt
import sys;print ''.join([chr(x) for x in [x if (x%(ord('a')-1)) else ord('z') for x in [ (ord(x)-1) for x in sys.argv[1]]]])
So, now you can send your next missive in shift+1 cipher ! Of course, you are not going to send trade secrets that way, would you? 🙂
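
For readability, here is an expanded (Python 3) sketch of the same shift-by-one idea – my own rewrite, not part of the original one-liners:

import sys

ALPHABET = "abcdefghijklmnopqrstuvwxyz"

def shift(text, amount):
    # Shift each letter by `amount` positions, wrapping around at 'z'.
    out = []
    for ch in text.lower():
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + amount) % 26])
        else:
            out.append(ch)  # leave non-letters untouched
    return "".join(out)

if __name__ == "__main__":
    word = sys.argv[1]
    print(shift(word, 1))   # encrypt: shift forward by one
    print(shift(word, -1))  # decrypt: shift back by one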

Install Python-MySQL on Windows 7 64-bit

I’ve been wanting to try my hand at SQLAlchemy for a while now. No particular reason other than checking how the APIs for the ORM are. To decide on the database to use, I thought MySQL might be a good idea because most OSS libraries support MySQL. But the catch was that I wanted to do all this on Windows 7, and a 64-bit Windows 7 at that. So, this was a bit tricky. I couldn’t do something like easy_install MySQL-Python on Windows. So, here is a step-by-step guide to installing 64-bit MySQL and Python-MySQL on Windows 7. The steps might work with Windows Vista too.

Pre-requisites

  1. Python 2.7 – might not work with Python 3
  2. Visual Studio 2008. The Visual Studio express edition will also work. Essentially, a C++ compiler is required
  3. Access to modify the registry
  4. MySQL DB 5.5 on Windows

Installing Python-MySQL

  1. Download the combined MySQL installer for Windows from MySQL. The current version is 5.5
  2. Download the Python-MySQLDB library here – Python-MySQLDB. The current version is – MySQL-python-1.2.3
  3. You need to have the Microsoft Visual Studio installed on your machine. I had Visual Studio 2008 installed, but this can work with the Microsoft Visual Studio Express edition too
  4. Install MySQL – select the ‘developer’ configuration during install. This will install the required libraries and the header files for the C connector – these are important. Note the directory where you are installing MySQL
  5. This will install MySQL 64-bit on the machine. I am running Windows 7 64-bit and hence I installed the 64-bit version of MySQL
  6. Extract the MySQL-python-1.2.3 into a directory
  7. Open the site.cfg and make the following modification
    -registry_key = SOFTWARE\MySQL AB\MySQL Server 5.0
    +registry_key = SOFTWARE\Wow6432Node\MySQL AB\MySQL Server 5.5

    Based on the version of the MySQL server, change the 5.5 to whatever is the version you installed. On Windows 7 (and I think on Vista too) 64-bit, the registry key is changed to this location during installation. This took me a long time to find. This has been documented here too – MySQL-Python forum (check for the comment no. 13)
  8. Then modify the setup_windows.py by adding the library_dirs and include_dirs. Here we add the directories for the C connector that was installed as part of the MySQL installation. I could not locate the registry key for the connector, so I added the directories for the headers and the libraries to the compiler parameters. Note that I added the optimized library directory. If you are debugging the MySQL connector, you will want to include the debug version of the libraries
    library_dirs = [ os.path.join(mysql_root, r'lib\opt'), "C:\Program Files\MySQL\Connector C 6.0.2\lib\opt" ]
    include_dirs = [ os.path.join(mysql_root, r'include'), "C:\Program Files\MySQL\Connector C 6.0.2\include" ]
  9. One final step and you are ready to go. As documented – here, you need to modify the msvc9compiler.py. Look for the line containing
    ld_args.append('/MANIFESTFILE:' + temp_manifest) and add a new line – ld_args.append('/MANIFEST')
  10. Now install the Python library using python setup.py install in the MySQL-python-1.2.3 directory
  11. This will use the Visual Studio compiler and the directories mentioned in include_dirs and library_dirs to build the .egg file for the MySQL-Python library
  12. Some more help here
  13. You can also check this on SO, though for 64-bit Windows I found that that solution did not work
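
Once the build and install succeed, a quick way to verify that everything is wired up is a minimal connection test – a hedged sketch, with placeholder host, credentials and schema name:

import MySQLdb  # the module installed by MySQL-python-1.2.3

# Placeholder connection details - substitute your own host, user,
# password and schema.
conn = MySQLdb.connect(host="localhost", user="root", passwd="secret", db="test")
cursor = conn.cursor()
cursor.execute("SELECT VERSION()")
print(cursor.fetchone())  # e.g. ('5.5.x',)
cursor.close()
conn.close()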