Tagged: Python

  • Harshad Joshi 4:07 pm on April 11, 2010
    Tags: Python   

    Python Pits.. 

    Q. How often do we need to convert a Python tuple to a plain string?
    A. I once had to collect some data directly from a database and, as if that wasn't enough, send it to a mobile phone by SMS. I used MySQLdb to connect to the database. The program was as follows.

    import codecs
    import MySQLdb

    conn = MySQLdb.connect(host='localhost', use_unicode=True, charset="utf8", user='harshad', passwd='', db='statusnet')

    cursor = conn.cursor()

    p = cursor.execute("""select content from notice where profile_id = 1""")

    # fetchall() returns every row as a tuple, even for a single column
    g = cursor.fetchall()

    q = []
    for i in g:
        j = str(i)  # str() of a tuple gives its repr, parentheses included
        q.append(codecs.encode(j))

    print len(q)

    for e in q:
        print codecs.encode(e)

    print q

    The output of the program was as follows.

    python statusnet.py
    5
    (u'Traces, many faces, lost till the end of time... http://localhost/statusnet-0.8.2/index.php/attachment/1',)
    (u'hello.c http://localhost/statusnet-0.8.2/index.php/attachment/2',)
    (u'Hallo.',)
    (u'Wussup??',)
    (u'!harshad hi..',)
    ["(u'Traces, many faces, lost till the end of time... http://localhost/statusnet-0.8.2/index.php/attachment/1',)", "(u'hello.c http://localhost/statusnet-0.8.2/index.php/attachment/2',)", "(u'Hallo.',)", "(u'Wussup??',)", "(u'!harshad hi..',)"]

    Wow, look at the output. Every row comes back as a tuple. Now, do I send a tuple as an SMS??
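
    The output looks this way because each fetched row is a tuple, even when the query selects a single column, and calling str() on a tuple gives its printable representation, parentheses and all. A minimal sketch of the difference follows. The hard-coded `rows` list is a stand-in for the real query result, and the sketch assumes Python 3 syntax, so the u prefix does not appear.

```python
# Hypothetical stand-in for cursor.fetchall(): one 1-tuple per row.
rows = [("Hallo.",), ("Wussup??",), ("!harshad hi..",)]

as_repr = [str(row) for row in rows]  # the tuple's repr, not the text
as_text = [row[0] for row in rows]    # the column value itself

for r, t in zip(as_repr, as_text):
    print(r, "->", t)
```

    Indexing the row with [0] is what separates the message text from the tuple wrapped around it.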

    After some time I realized that there must be a way to convert all the tuples to strings and then send those strings to the screen, to the mobile, or anywhere else.

    Here is the code.

    import MySQLdb
    import codecs

    conn = MySQLdb.connect(host='localhost', use_unicode=True, charset="utf8", user='harshad', passwd='', db='statusnet')

    cursor = conn.cursor()

    p = cursor.execute("""select content from notice where profile_id = 1""")

    # Collect the rows one at a time; fetchone() returns None when done
    y = []
    while 1:
        a = cursor.fetchone()
        if a is None:
            break
        print a
        y.append(a)
        print y

    # Unpack every row tuple and keep only the UTF-8 encoded strings
    g = []
    for i in y:
        s = i
        b = [j.encode("utf-8") for j in s]
        for k in b:
            print b
            g.append(k)

    print "g > ", g

    Output

    python statue.py
    (u'Traces, many faces, lost till the end of time... http://localhost/statusnet-0.8.2/index.php/attachment/1',)
    [(u'Traces, many faces, lost till the end of time... http://localhost/statusnet-0.8.2/index.php/attachment/1',)]
    (u'hello.c http://localhost/statusnet-0.8.2/index.php/attachment/2',)
    [(u'Traces, many faces, lost till the end of time... http://localhost/statusnet-0.8.2/index.php/attachment/1',), (u'hello.c http://localhost/statusnet-0.8.2/index.php/attachment/2',)]
    (u'Hallo.',)
    [(u'Traces, many faces, lost till the end of time... http://localhost/statusnet-0.8.2/index.php/attachment/1',), (u'hello.c http://localhost/statusnet-0.8.2/index.php/attachment/2',), (u'Hallo.',)]
    (u'Wussup??',)
    [(u'Traces, many faces, lost till the end of time... http://localhost/statusnet-0.8.2/index.php/attachment/1',), (u'hello.c http://localhost/statusnet-0.8.2/index.php/attachment/2',), (u'Hallo.',), (u'Wussup??',)]
    (u'!harshad hi..',)
    [(u'Traces, many faces, lost till the end of time... http://localhost/statusnet-0.8.2/index.php/attachment/1',), (u'hello.c http://localhost/statusnet-0.8.2/index.php/attachment/2',), (u'Hallo.',), (u'Wussup??',), (u'!harshad hi..',)]
    ['Traces, many faces, lost till the end of time... http://localhost/statusnet-0.8.2/index.php/attachment/1']
    ['hello.c http://localhost/statusnet-0.8.2/index.php/attachment/2']
    ['Hallo.']
    ['Wussup??']
    ['!harshad hi..']
    g >  ['Traces, many faces, lost till the end of time... http://localhost/statusnet-0.8.2/index.php/attachment/1', 'hello.c http://localhost/statusnet-0.8.2/index.php/attachment/2', 'Hallo.', 'Wussup??', '!harshad hi..']

    I don't know how often we land up in a similar situation. Now that we have a list, we can easily iterate through it and get a sleek string as output. No hassles at all. It still shows some hiccups; nevertheless it works, kludgy but effective, and it can handle any number of rows.
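
    For anyone who wants the same idea in fewer lines, here is a rough sketch. It is an assumption-laden illustration and not the program above: it uses the standard library's sqlite3 with an in-memory table in place of MySQLdb and the statusnet database, so that it runs without a server, and it is written for Python 3. The table and column names mirror the query above; the sample rows are made up.

```python
import sqlite3

# sqlite3 stands in for MySQLdb here so the sketch needs no server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notice (profile_id INTEGER, content TEXT)")
conn.executemany(
    "INSERT INTO notice (profile_id, content) VALUES (?, ?)",
    [(1, "Hallo."), (1, "Wussup??"), (2, "not mine"), (1, "!harshad hi..")],
)

cursor = conn.execute(
    "SELECT content FROM notice WHERE profile_id = ? ORDER BY rowid", (1,)
)

# Each row is a 1-tuple; unpacking it keeps only the string inside.
messages = [content for (content,) in cursor]

# One plain string, ready for the screen or for whatever sends the SMS.
text = "\n".join(messages)
print(text)
conn.close()
```

    The list comprehension does the work of the two nested loops, and "\n".join() turns the list into one plain string.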

    My main intention in writing this post is that I couldn't find a better example on Google and thought that if I wrote it up, it might be useful. If someone is looking for a solution, here it is.. 🙂

    End of Logs.

    Posted via email from [root@localhost /root]#

    PS: In case you wonder why I haven't added any exception handling code, it is one of my prime mottoes not to write code that generates exceptions. So far it's been good. 😉


  • Harshad Joshi 5:25 pm on July 17, 2008
    Tags: feedparser.py, Google News, Mark Pilgrim, Python   

    How to write a kludgy news crawler in Python and challenge Google News to its limits 

    A kludge (or, alternatively, kluge) is a clumsy or inelegant solution to a problem or difficulty. In engineering, a kludge is a workaround, typically using unrelated parts cobbled together. Especially in computer programs, a kludge is often used to fix an unanticipated problem in an earlier kludge; this is essentially a kind of cruft.

    I was searching through data on my old disk and found some interesting code I had written (or rather, abandoned) a year and a half ago. At that time, I was fascinated by the concept of Google News, which scanned and gathered news from almost 450 sources and mashed them up together on a single page. Many sources, one destination. Needless to say, Google created a smash hit product. Life appeared easy, all of a sudden.

    Given my nature, it wasn't surprising that I desired to write the next Google News Killer app. It began at night, around 10:30 to be precise. I was determined to finish the program in a night's time. Python was my original (and only) choice that seemed suitable for creating the next big thing. Googling around, I found that a module called feedparser.py makes parsing RSS feeds easy (so to say). However, there was a problem: at that time, I had no clue what XML meant. That was only the beginning. Later, I also discovered that I had extremely limited knowledge of HTML. Then I realized that my Python basics were giving me plenty of surprises...

    Bah, it looked so bad. Here I was trying to write a good program, and there were tonnes of difficulties on the very first step. However, determination took over from desperation, and after tweaking and pondering for well over 46 minutes, I was able to produce an extremely kludgy, extremely basic, extremely primitive Google News Killer. Wow, the feeling was so good. Imagine writing something from scratch, and that too without any help (OK, I took help from Mark Pilgrim's feedparser.py and python.org). I chose to call it News Crawler.

    Get the Python file by clicking the link: check-news. Don't forget to rename the file to check-news.py, and make sure that the indentation is correct.

    Now something about the code.

    1. As I said earlier, the code is extremely dumb, extremely kludgy, extremely primitive, extremely basic, and there's a lot of shoddiness in there. Don't laugh at it even if it appears funny.

    2. The code has heard nothing of security, and is meant to run in a controlled environment.

    3. It doesn't make use of any SQL database backend, but is wise enough to store the RSS feeds on the hard disk before dissecting them and extracting useful content.

    4. It expects the XML files to be in Unicode format. Some rogue sites make use of shabby encoding, which raises an exception in the program.

    5. I haven't added any exception handling. Just laziness, nothing more.

    6. For reference, I have shown how we can incorporate Slashdot and Reddit feeds on a single page. You can add your favourite feed.
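
    Since the original file isn't reproduced here, the following is only a rough sketch of the idea in points 3 and 6: take several RSS documents, pull out the items, and put them on a single page. It is not the check-news code. To keep it runnable without extra installs it uses the standard library's xml.etree.ElementTree instead of feedparser.py, it is written for Python 3, and the two feed snippets, their example.com links, and the helper names `parse_items` and `render_page` are all made up for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

# Made-up RSS 2.0 snippets standing in for feeds already saved to disk.
FEED_A = """<rss version="2.0"><channel><title>Feed A</title>
<item><title>First story</title><link>http://example.com/a1</link></item>
<item><title>Second &amp; last</title><link>http://example.com/a2</link></item>
</channel></rss>"""

FEED_B = """<rss version="2.0"><channel><title>Feed B</title>
<item><title>Other story</title><link>http://example.com/b1</link></item>
</channel></rss>"""


def parse_items(xml_text):
    """Return (source, title, link) tuples for every <item> in an RSS string."""
    channel = ET.fromstring(xml_text).find("channel")
    source = channel.findtext("title", default="")
    return [
        (source, item.findtext("title", default=""), item.findtext("link", default=""))
        for item in channel.iter("item")
    ]


def render_page(items):
    """Build one small HTML page listing the items from all feeds."""
    rows = [
        '<li><a href="%s">%s</a> (%s)</li>'
        % (escape(link, quote=True), escape(title), escape(source))
        for source, title, link in items
    ]
    return "<html><body><ul>\n%s\n</ul></body></html>" % "\n".join(rows)


items = parse_items(FEED_A) + parse_items(FEED_B)
page = render_page(items)
print(page)
```

    Adding another feed is then a matter of appending one more parse_items() result to the list before rendering.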

    ToDo

    1. Make use of a good HTML templating system.

    2. Solve the problem of Unicode.

    3. Add error checking and improve its utility by making use of Python's object-oriented features.

    4. Add a SQL backend system for storing the parsed RSS data. To be honest, it's the toughest job to do.

    5. Post up a nice PowerPoint presentation describing the system. 🙂

    6. PS: I will definitely not do any of the above unless someone seriously decides to fund me.
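
    For what it's worth, ToDo items 2 and 4 might look something like the sketch below. It is one possible approach under stated assumptions, not part of News Crawler: sqlite3 from the standard library serves as the SQL backend, the `news` table and the helper names `decode_feed` and `store_items` are invented for the example, and badly encoded bytes are replaced instead of raising an exception. Python 3 is assumed.

```python
import sqlite3


def decode_feed(raw_bytes):
    """Decode feed bytes as UTF-8, replacing undecodable bytes instead of raising."""
    return raw_bytes.decode("utf-8", errors="replace")


def store_items(conn, items):
    """Insert (source, title, link) rows, skipping links that are already stored."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS news ("
        "link TEXT PRIMARY KEY, source TEXT, title TEXT)"
    )
    conn.executemany(
        "INSERT OR IGNORE INTO news (source, title, link) VALUES (?, ?, ?)", items
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
title = decode_feed(b"Caf\xe9 opens")  # a Latin-1 byte that is not valid UTF-8
store_items(conn, [("Feed A", title, "http://example.com/a1")])
store_items(conn, [("Feed A", title, "http://example.com/a1")])  # duplicate, ignored
count = conn.execute("SELECT COUNT(*) FROM news").fetchone()[0]
print(count, title)
```

    Using the link as the primary key is a design choice made here so that re-running the crawler over the same feed does not store the same story twice.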

    After a long time I am back in the programming world. I got so busy with other things that I had to abandon my dream project, but who knows, someday it may come true..  😉

     
    • Sandy 6:58 pm on July 20, 2008

      Hi, I was trying to do something similar when I bumped into your blog. Unfortunately I am not able to download your .py script. Can you please email it to me?

      san.grad@gmail.com is the ID.

      Thanks in advance.

    • lobiga 6:24 pm on March 30, 2009

      Hey
      Can you send me the source, please?
