I am starting to work with Python again after 8 years. I am trying to write a program with BeautifulSoup and an array argument. I pass the array argument medios to the function count_words, but it doesn't work. Is there a way to fix it, or to search for the word on multiple websites using BeautifulSoup?

    import requests
    from bs4 import BeautifulSoup

    def count_words(url, the_word):
        r = requests.get(url, allow_redirects=False)
        soup = BeautifulSoup(r.content, 'lxml')
        words = soup.find(text=lambda text: text and the_word in text)
        # print(words)
        return len(words)

    def main():
        url = 'https://www.nytimes.com/'
        medios = {
            'Los Angeles Times': ['http://www.latimes.com/'],
            'New York Times' : ['http://www.nytimes.com/' ]
        }
        word = 'Trump'
        #count = count_words(url, word)
        cuenta = count_words(medios, word)
        # print('\n El Sitio: {}\n Contiene {} occurrencias de la palabra: {}'.format(url, count, word))
        print('\n La palabra: {} aparece {} occurrencias en el New York Times'.format(word, cuenta))

    if __name__ == '__main__':
        main()

1 Answer

There are three problems here:

1. medios is a dict, but count_words accepts only a single URL string. You have to loop over the dict's keys and values and call the function once per URL.
2. soup.find(text=...) returns only the first matching string (or None if nothing matches), so len(words) gives the length of that string, not the number of occurrences. To count occurrences of a word, call count on the page text.
3. Some sites respond with 403 unless the request carries a browser-like User-Agent header. With allow_redirects=False, a redirect response such as 301 is also returned as-is instead of being followed, so drop that argument.

    import requests

    headers = {'user-agent': "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.89 Safari/537.36"}

    def count_words(url, the_word):
        # Redirects are followed by default; the User-Agent header avoids 403 responses.
        r = requests.get(url, headers=headers)
        # Case-insensitive substring count over the raw HTML, markup included.
        return r.text.lower().count(the_word.lower())

    def main():
        medios = {
            'Los Angeles Times': ['http://www.latimes.com/'],
            'New York Times' : ['http://www.nytimes.com/']
        }
        word = 'trump'
        # Each site name maps to a list of URLs, so loop over both levels.
        for web_name, urls in medios.items():
            for url in urls:
                cuenta = count_words(url, word)
                print('La palabra: {} aparece {} occurrencias en el {}'.format(word, cuenta, web_name))

    if __name__ == '__main__':
        main()

Output (the counts will change as the pages change):

    La palabra: trump aparece 47 occurrencias en el Los Angeles Times
    La palabra: trump aparece 194 occurrencias en el New York Times
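Counting substrings in the raw HTML also matches markup, script contents and longer words such as "trumpet". Since the question asks about BeautifulSoup, here is a rough sketch of one alternative that counts whole-word matches in the page text only. The helper names count_word_in_html and report, the shortened User-Agent string and the 10-second timeout are assumptions made for illustration, not part of the original answer.

```python
import re

import requests
from bs4 import BeautifulSoup

# Browser-like User-Agent; the exact string is an arbitrary example.
HEADERS = {"User-Agent": "Mozilla/5.0 (X11; Linux x86_64)"}


def count_word_in_html(html, the_word):
    """Count whole-word, case-insensitive matches in the text of an HTML page."""
    soup = BeautifulSoup(html, "html.parser")
    # Remove elements whose contents are code or styling rather than page text.
    for tag in soup(["script", "style", "noscript"]):
        tag.decompose()
    text = soup.get_text(separator=" ")
    # \b keeps "trump" from matching inside "trumpet".
    pattern = r"\b" + re.escape(the_word) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


def count_words(url, the_word):
    """Download url (following redirects) and count the_word in its text."""
    r = requests.get(url, headers=HEADERS, timeout=10)
    r.raise_for_status()  # raise an exception on 403 and other error statuses
    return count_word_in_html(r.text, the_word)


def report(medios, the_word):
    """Print one line per URL for a dict that maps site names to URL lists."""
    for web_name, urls in medios.items():
        for url in urls:
            cuenta = count_words(url, the_word)
            print('La palabra: {} aparece {} occurrencias en el {}'.format(
                the_word, cuenta, web_name))
```

Calling report(medios, 'Trump') with the dict from the question prints one line per site. The counts will usually be lower than with the raw-HTML approach, and pages that build their content with JavaScript may show very few matches because requests does not execute scripts.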
