Tuesday, June 26, 2007

Factorials of Large Numbers

4023872600770937735437024339230039857193748642107
146325437999104299385123986290205920442084869694
048004799886101971960586316668729948085589013238
2966994459099742450408707375991882362772718873251
9779505950995276120874975462497043601418278094646
49629105639388743788648733711918104582578364784997
7012476632889835955735432513185323958463075557409
11426241747434934755342864657661166779739666882029
1207379143853719588249808126867838374559731746136
08537953452422158659320192809087829730843139284
440328123155861103697680135730421616874760967587
134831202547858932076716913244842623613141250878
020800026168315102734182797770478463586817016436
502415369139828126481021309276124489635992870511
496497541990934222156683257208082133318611681155
36158365469840467089756029009505376164758477284
21889679646244945160765353408198901385442487984
95995331910172335555660213945039973628075013783
76153071277619268490343526252000158885351473316
117021039681759215109077880193931781141945452572
23865541461062892187960223838971476088506276862
96714667469756291123408243920816015378088989396
451826324367161676217916890977991190375403127462
22899880051954444142820121873617459926429565817
46628302955570299024324153181617210465832036786
90611726015878352075151628422554026517048330422
61439742869330616908979684825901254583271682264
58066526769958652682272807075781391858178889652
20816434834482599326604336766017699961283186078
83861502794659551311565520360939881806121385586
0030143569452722420634463179746059468257310379
0084024432438465657245014402821885252470935190
62092902313649327349756551395872055965422874977
40114133469627154228458623773875382304838656889
7646192738381490014076731044664025989949022222
1765904339901886018566526485061799702356193897
01786004081188972991831102117122984590164192106
88843871218556461249607987229085192968193723886
42614839657382291123125024186649353143970137428
53192664987533721894069428143411852015801412334
4828015051399694290153483077644569099073152433
2782882698646027898643211390835062170950025973
89863554277196742822248757586765752344220207573
6305694988250879689281627538488633969099598262
8095612145099487170124451646126037902930912088
9086942028510640182154399457156805941872748998
0942547421735824010636774045957417851608292301
353580818400969963725242305608559037006242712
4341690900415369010593398383577793941097002775
347200000000000000000000000000000000000000
00000000000000000000000000000000000000000
00000000000000000000000000000000000000000
00000000000000000000000000000000000000000
00000000000000000000000000000000000000000
00000000000000000000000000000000000000000
000000

That is the value of factorial 1000. Writing a program to print the factorial of a number is one of the most basic and almost elementary programming exercises. But naturally the simple one that we wrote in class is limited in that it cannot output the factorial of numbers beyond 20 (64 bit integers). Below is the description of a method that would generate the factorial of numbers far greater than 20. I believe the method is correct - I know for sure that the output is perfect for 1! to 5000!.

As one would know, fact(n) = 1 * 2 * 3 * ..... * (n-1) * n. To generate factorials of large numbers, the key is multiplication. If we knew how to multiply large numbers, the job is done. When I say large numbers, I mean numbers with 50 digits each or even more. To achieve this, let us just recall the basic multiplication procedure that we learnt in our 3rd grade.




NOTE:-There are few things to be noticed in any multiplication. The last digit of the answer is determined only by the last digits of the two numbers being multiplied. For example,
3*4 or 23*44 783687268732683*2313123124 would all end with 2. Similarly the last two digits of the answer are determined only by the last two digits of the two numbers being multiplied. 12*24 or 912*524 or 1231123211212*7272896782687687624 would all end with 88. So the basis of the entire logic lies in the fact that parts of the answer are affected only by parts of numbers being multiplied.


A careful look at the above figure should give us an idea. At a point of time, we are not multiplying more than two single digit numbers. Thus, with careful handling of the numbers, we can be sure that the holistic procedure shown in the diagram can be used to multiply two numbers, each having 1000 digits or even more. All we do is save our answer at every step and finally add all the intermediate answers to obtain the final answer. This is the standard multiplication procedure. The logic behind it is easy to understand.

4567 * 8767
= 4567*(7+10*6+100*7+1000*8) = 4567*7 + 10*(4567*6) + 100*(4567*7) + 1000*(4567*8)

Let T1 = 4567*7, T2 = 10*(4567*6), T3 = 100*(4567*7), T4 = 1000*(4567*8)
Referring to the figure above, we can say that the first number in the multiplication (31969) is actually value of T1, the second number (274020) is the value of term T2 and so on. We add all the values to obtain the final answer as 40038889.

Let us now generalize the multiplication procedure. Take two numbers
abcd and efgh where a,b,c,d,e,f,g,h are the digits of the numbers.

m = abcd * efgh
m = abcd * (h + 10*g + 100*f + 1000*e)
= (d + c*10 + b*100 + a*1000) * (h + 10*g + 100*f + 1000*e)
= (d*h)+10*(d*g + c*h)+100*(b*h + c*g + d*f)+1000*(a*h + b*g + c*f + d*e)+ ... ---(1)

m = d1 +10*d2 +100*d3 +1000*d4 + ........ 10^(n-1)*dn ---(2)
where each of the d's <= 9

Our main idea is to write a program to do the multiplication. If we get equation (2), i.e. if we manage to obtain the values of d1, d2, ..... dn, then the final answer is the number formed by concatenating the digits d1...dn in reverse order, i.e., dn, dn-1 ..... d1. So our aim reduces to obtaining (2).

(1) is easily achievable. So all we have to do is transform (1) into (2). Though (1) and (2) look similar, there is a critical difference. For instance, d*h is not necessarily equal to d1.
The numbers d1, d2, ... are all lesser than or equal to 9. d*h is not necessarily less than or equal to 9. Same is the case with d2 and (d*g + c*h).

The main problem lies in the fact that d*h, for example, might be a two-digit number. Let d*h = AB where A,B, are single digit numbers. Now,
d*h = 10*A + B.
So (1) can be written as
m = B + 10*(d*g + c*h + A) + ......

"...This basically means that in the case where we get an intermediate answer as a number with two digits or more, we just take it as a carry...".
I shall use this basic idea to obtain the value of 4567*8767 using (1)

Here a=4, b=5, c=6, d=7, e=8, f=7, g=6, h=7.

d*h=49. Take d1=9 and carry=4
c*h + d*g=42+42=84. Add carry to this. So value becomes 84+4 = 88. Take d2=8 and carry=8
b*h + c*g + d*f=35+36+49=120. Add carry to this. Value=128. Take d3=8 and carry=12
a*h + b*g + c*f + d*e=28+30+42+56=156. Add carry to this. Value=168. Take d4=8 and carry=16
a*g + b*f + c*e=24+35+48=107. Add carry to this. Value = 123. Take d5=3 and carry=12
a*f + b*e=28+40=68. Add carry to this. Value=80. Take d6=0 and carry=8
a*e=32. Add carry to this. Value = 40. Take d7=0 and carry=4
d8=carry=4

When we concatenate d8,d7,.....,d1, we get 40038889!!!!! Which is the answer we want.

Basically, I used (1) itself. The only point is that when I got a value with more than one digit, I just took the last digit of the value into the answer and took the remaining digits as the carry. I took care to add the carry to every intermediate value I obtained. To help writing a program easily, I would like to illustrate the above multiplication procedure in a better way.

a*0 + b*0 + c*0 + d*h=49. Take d1=9 and carry=4
a*0 + b*0 + c*h + d*g=42+42=84. Add carry to this. So value becomes 84+4 = 88. Take d2=8 and carry=8
a*0 + b*h + c*g + d*f=35+36+49=120. Add carry to this. Value=128. Take d3=8 and carry=12
a*h + b*g + c*f + d*e=28+30+42+56=156. Add carry to this. Value=168. Take d4=8 and carry=16
a*g + b*f + c*e + d*0=24+35+48=107. Add carry to this. Value = 123. Take d5=3 and carry=12
a*f + b*e + c*0 + d*0=28+40=68. Add carry to this. Value=80. Take d6=0 and carry=8
a*e + b*0 + c*0 + d*0=32. Add carry to this. Value = 40. Take d7=0 and carry=4
d8=carry=4

In case you did not spot it, all the terms contain a,b,c,d. the other digits follow this format:-
0,0,0,h
0,0,h,g
0,h,g,f
h,g,f,e
g,f,e,0
f,e,0,0
e,0,0,0

This is the method that has to be used to multiply two large numbers. Now coming back to generating the factorial, all we have to do is to repeatedly perform such multiplications and obtain the value of the factorial. Here I present the crux of the code. However, I suggest that you do not look at it. It would be best if you implement this logic on your own, cause it might very well turn out that your implementation is better than mine.

//fact(n)=n*fact(n-1)
/* The method of multiplication described above has been used to obtain the value of fact(n), i.e. fact(n-1)*n */
fact(1)=1; //initialization
INPUT:- An integer n
for i = [2,n]
.....str1 = i;
.....len1 = length(str1);
.....str2 =
len1-1 number of 0s + fact(i-1) + len1-1 number of 0s;
.....len2 = length(str2);
.....carry=0;
.....
for j = [0,len2-len1]
.........val=carry;
.........
for k = [0,len1)
...............val = val + ( str2[j+k] * str1[k] );
.........END
FOR
.........append
val%10 to fact(i);
.........carry=val/10;
.....END
FOR
.....append the digits of carry in reverse order to fact(i)
END FOR
OUTPUT:- The digits of fact in reverse order
NOTE:- str1,str2, fact are all strings.

Once again, I suggest that you implement this logic without looking at the above snippet. However, if you want the code, here it is.

6 comments:

ud said...

Gr8 work Abhiram....
Blogs are really informative..... keep it up.... gud work...

ranjan said...

how do you store these large numbers...please give me the code and if possible explain it.....

Abhi said...

@Ranjan
Give me your email id. I can send it there.

And:-
@ud
Thanks a lot.

Anonymous said...

Can u tell, how to store the large values obtained by calculating factorials? B'coz no existing data type support such large values.

Abhi said...

@anonymous
Check the code that I have uploaded. You will understand how I have stored the values. I have stored the values by using strings.

Anonymous said...

Hi,

I begin on internet with a directory