
Poogle - Web Search

Our next application is a web search program.

It takes a series of search terms (the items in the list called "inputs"), searches the web pages in the "pages" list, and for each term counts the number of pages where a match is found.
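Before adding MPI, the counting step described above can be sketched on its own. This is only an illustration under stated assumptions: the page texts in `sample_pages` are made up and stand in for downloaded pages, and `count_matching_pages` is a hypothetical helper name, not something from poogle.py.

```python
# Made-up page texts standing in for downloaded pages (assumption:
# the real program fetches page contents over the network instead).
sample_pages = {
    "http://example.org/one": "Page one mentions Seidel.",
    "http://example.org/two": "Page two mentions LONI.",
    "http://example.org/three": "Page three mentions both Seidel and LONI.",
}

def count_matching_pages(term, texts):
    """Return how many of the texts contain term, ignoring case."""
    term = term.lower()
    return sum(1 for text in texts if term in text.lower())

# one line of output per search term: the term and its page count
for term in ["seidel", "loni", "beck"]:
    print("%s %d" % (term, count_matching_pages(term, sample_pages.values())))
```

Each page counts at most once per term, however many times the term appears in it.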

You can edit the list of pages and/or search terms to find your own results.

file: poogle.py
# Challenge: return the number of matches within a page
import mpi
import urllib

# what we are searching for
inputs = [
    "seidel",
    "beck",
    "loni",
    "eastman",
    "brandt"]

# collection of web pages our search engine knows about
pages = [
  'http://www.cct.lsu.edu/about/overview/',
  'http://www.cct.lsu.edu/projects/Coastal_Modeling/',
  'http://www.cct.lsu.edu/projects/SURASCOOP/',
  'http://www.cct.lsu.edu/projects/CactusCode/',
  'http://www.cct.lsu.edu/about/employment/employment.php',
  'http://www.cct.lsu.edu/projects/GridChemCCG/',
  'http://www.cct.lsu.edu/about/people/faculty/all.php',
  'http://www.cct.lsu.edu/about/focus/',
  'http://www.cct.lsu.edu/projects/Enlightened/',
  'http://www.loni.org/',
  'http://www.loni.org/plan/',
  'http://www.cct.lsu.edu/news/news/289']

# split the pages into contiguous blocks, one per proc; when the page
# count is not a multiple of mpi.size, the first `extra` procs take
# one more page each so that no page is skipped
n, extra = divmod(len(pages), mpi.size)
ilo = mpi.rank*n + min(mpi.rank, extra)
ihi = ilo + n - 1
if mpi.rank < extra:
    ihi += 1

# each mpi proc downloads its subset of pages, from ilo to ihi,
# lowercased so that the search ignores case
c = {}
for i in range(ilo,ihi+1):
    c[i] = urllib.urlopen(pages[i]).read().lower()

for term in inputs:
    # pages in this proc's subset that contain the search term
    matches = []

    for i in range(ilo,ihi+1):
        if c[i].find(term) >= 0:
            matches.append(pages[i])

    # proc zero receives the results of all searches
    if mpi.rank == 0:
        for i in range(1,mpi.size):
            # receive from each proc in turn; with mpi.ANY_SOURCE a fast
            # proc's result for the next term could be counted under this one
            other_matches = mpi.recv(i)[0]
            matches.extend(other_matches)
    else:
        mpi.send(matches,0)

    if mpi.rank == 0:
        print term,len(matches)

Running poogle.py on two processes gives output like this:
> mpirun -np 2 pyMPI poogle.py
seidel 2
beck 0
loni 4
eastman 0
brandt 0
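
The comment at the top of poogle.py poses a challenge: return the number of matches within a page. One possible starting point, offered as a sketch rather than as the tutorial's own solution, is Python's built-in str.count method. The helper name `count_occurrences` and the sample text are made up for illustration.

```python
def count_occurrences(term, text):
    """Count non-overlapping occurrences of term in text, ignoring case."""
    return text.lower().count(term.lower())

# made-up text standing in for the contents of one downloaded page
sample_text = "LONI links: loni.org and Loni plans"

print("loni %d" % count_occurrences("loni", sample_text))
print("beck %d" % count_occurrences("beck", sample_text))
```

In the listing, the per-page test c[i].find(...) >= 0 could then give way to adding up such counts, with each process sending its total to proc zero instead of a list of pages.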