Jill Ceke

@verizon.net

Awk pattern matching name within records

Hi,

I'm very new to these forums. I was wondering if someone could help an AWK beginner with matching an actor's name against the movies that actor appears in, where each movie is stored as a record. Let's say we have a database of 4 movies (each movie is a record with a name line, a studio + year line, and actor lines, with a blank line separating the records). I need to pattern match an actor's name and return the name followed by the movies the actor has appeared in, in chronological order. In this case I would like to match Jennifer Lawrence and create an output file that lists Jennifer Lawrence at the top, followed by the names of the movies she has appeared in, in chronological order, each on a separate line (just an output file of 3 lines). Any help would be appreciated. Thank you!

Casablanca
WB 1942
Humphrey Bogart: Rick Blaine
Ingrid Bergman: Ilsa Lund
Paul Henreid: Victor Laszlo

Hunger Games
Lionsgate 2012
Jennifer Lawrence: Katniss Everdeen
Josh Hutcherson: Peeta Mellark
Liam Hemsworth: Gale Hawthorne

Like Crazy
Paramount 2011
Anton Yelchin: Jacob Helm
Felicity Jones: Anna Gardner
Jennifer Lawrence: Samantha

Raging Bull
United Artists 1980
Robert de Niro: Jake LaMotta
Joe Pesci: Joey LaMotta
Cathy Moriarty: Vickie Thailer

pablo
MVM
join:2003-06-23
kudos:1
Hi,

Why don't you post what you've written and we can work from there? Otherwise your request looks like a `do-my-homework-assignment' post, which I'm sure is not what you're doing ... right?

Cheers,
-pablo
--
openSUSE 12.3/KDE 4.x
Assorted goodies (updated URL!): »pablo-blog.blueoakdb.com


Jill Ceke

@verizon.net
Hi, sorry, I'm a total beginner at this and so confused by AWK. Here is what I have done so far:

I tried to extract the actors' names and use them as keys, using a file I created:

NF==3 {print $2 $1, $0}
NF==4 {print $3 $1 $2, $0}

Next I tried to sort them, but I'm totally lost on the looping structure. Here is the second file I created to code the loop:

NF==3 {print $2 $1, $0}
NF==4 {print $3 $1 $2, $0}

When I use sort and pipe in these two programs and the database, I get this:

Felicity Jones: Anna 0
United Artists 1980
Liam Hemsworth: Gale Hawthorne
Ingrid Bergman: Ilsa Lund
Anton Yelchin: Jacob Helm
Joe Pesci: Joey LaMotta
Jennifer Lawrence: Katniss Everdeen
Jennifer Lawrence: Samantha
Josh Hutcherson: Peeta Mellark
Humphrey Bogart: Rick Blaine
Cathy Moriarty: Vickie Thailer
Paul Henreid: Victor Laszlo

I don't know what to do from this point on, or whether I'm going in the right direction. I also don't know how to account for the blank lines between records, or for the colon between the actor and the character role, which I'm trying to split on. I really want my output to look like this:

Jennifer Lawrence
Like Crazy
Hunger Games

Any help would be greatly appreciated =)
Thank you!

pablo
Hi Jill,

Everyone has to start somewhere.

I could write the awk script for you but then you wouldn't learn ... so if you're patient and willing, I/we will help you ... but you gotta give it a shot too.

The first thing I'd suggest is that, in the body of your awk script, you figure out how to tell when a new record starts, or when the current record has come to an end. I believe knowing when this happens in your script is important.

Write that code, and add some print statements to confirm your logic.
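
As a starting point only, here is a minimal sketch of that boundary check. It assumes records are separated by blank lines exactly as in the sample database, and that the first non-blank line of each record is the movie title. The names `mark_records`, `in_record`, and `n` are made up for illustration, and the sketch deliberately stops short of the actor matching and the sorting.

```sh
# Hypothetical helper: reads the movie database on stdin and prints
# where each record starts and ends. It is a debugging aid, not the solution.
# Usage (movies.txt is an assumed file name): mark_records < movies.txt
mark_records() {
    awk '
        # A line with no fields is the blank separator: the record is over.
        NF == 0 {
            if (in_record) print "-- end of record " n
            in_record = 0
            next
        }

        # The first non-blank line after a separator (or at the top of
        # the file) starts a new record; here it is the movie title.
        !in_record {
            in_record = 1
            n++
            print "-- start of record " n ": " $0
            next
        }

        # Any other line belongs to the current record.
        { print "   line inside record " n ": " $0 }

        # The last record has no trailing blank line, so close it here.
        END { if (in_record) print "-- end of record " n }
    '
}
```

Once the start and end markers show up where you expect, the same three rules are the places to hang the real work. It may also be worth reading about awk's "paragraph mode": setting RS to the empty string makes each blank-line-separated block a single record, which could be a shorter route to the same thing.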

Cheers,
-pablo