python

Get an element before a string with Beautiful Soup

  • I would like to be able to set Beautiful Soup/Python to parse for a string like “Important Values” and get the element directly before it (ignoring any line breaks or white-space), or better yet the value contained within the element.
  • The problem I’m having is that the website uses some very vague class names for the elements I need (“list-item”) that are reproduced in other elements, which I don’t want to grab.
  • There are all sorts of techniques to find elements in the HTML.
  • def search_function(tag):
        is_strong = tag.name == "strong"
        is_important = tag.next_sibling and tag.next_sibling.strip() == "Important Values"
        return is_strong and is_important
    important_values = int(soup.find(search_function).
  • I’m using Beautiful Soup to search a website for a set of integer values and produce a list of these, matched to names.
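
The search-function idea from the list above can be sketched as a runnable example. The page's real HTML is not shown here, so the markup, the `<strong>` tag and the name `precedes_label` are all assumptions made for illustration; the `isinstance` check is an added guard for the case where the next sibling is a tag rather than text.

```python
from bs4 import BeautifulSoup, NavigableString

# Hypothetical markup: the real page's HTML is not shown in the post.
html = """
<ul>
  <li class="list-item"><strong>12</strong> Other Values</li>
  <li class="list-item"><strong>37</strong>
      Important Values</li>
</ul>
"""

soup = BeautifulSoup(html, "html.parser")


def precedes_label(tag, label="Important Values"):
    """Return True for a <strong> tag whose next sibling is the label text."""
    sibling = tag.next_sibling
    return (
        tag.name == "strong"
        # Only text nodes have strip(); skip siblings that are tags or None.
        and isinstance(sibling, NavigableString)
        # strip() discards the line breaks and indentation around the label.
        and sibling.strip() == label
    )


# find() accepts a function and calls it once per tag in the document.
match = soup.find(precedes_label)
important_value = int(match.get_text(strip=True)) if match else None
print(important_value)
```

Because the match depends on the text that follows the tag, the shared “list-item” class never enters the search.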

I’m using Beautiful Soup to search a website for a set of integer values and produce a list of these, matched to names. However, the problem I’m having is that the website uses some very vague class names for the elements I need (“list-item”) that are reproduced in other elements, which I don’t want to grab. So far my code looks like:

However, this is also returning a whole bunch of stuff I don’t want. Is there a way I can make it so Beautiful Soup only returns the contents of elements which are followed by a certain string? So, if the web-page contains a section that’s like:

I would like to be able to set Beautiful Soup/Python to parse for a string like “Important Values” and get the element directly before it (ignoring any line breaks or white-space), or better yet the value contained within the element. So in this case Beautiful Soup would either print:

or, more preferably, just:

Is this possible?
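
One possible approach, sketched under assumptions: search for the label text itself, then walk backwards through its siblings to the nearest tag. The HTML below is invented for illustration (the page's actual markup is not included in the question), and `value_before` is a hypothetical helper name, not part of Beautiful Soup.

```python
import re

from bs4 import BeautifulSoup, Tag

# Hypothetical markup standing in for the page described in the question.
html = """
<div class="list-item">ignore me</div>
<div>
  <span class="list-item">42</span>
  Important Values
</div>
"""

soup = BeautifulSoup(html, "html.parser")


def value_before(soup, label):
    """Return the text of the tag directly before the text node `label`."""
    # Match a text node holding only the label, allowing surrounding whitespace.
    pattern = re.compile(r"^\s*" + re.escape(label) + r"\s*$")
    text_node = soup.find(string=pattern)
    if text_node is None:
        return None
    # previous_siblings yields whitespace-only strings as well as tags,
    # so take the first sibling that is a real element.
    for sibling in text_node.previous_siblings:
        if isinstance(sibling, Tag):
            return sibling.get_text(strip=True)
    return None


print(value_before(soup, "Important Values"))
```

This sketch assumes the label and the value share the same parent element. If the label sits in its own tag, the walk would need to start from `text_node.parent` instead.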
