I'm using BeautifulSoup and Selenium to scrape some data in Python. Here is my code, which I run against the URL https://www.flashscore.co.uk/match/YwbnUyDn/#/match-summary/point-by-point/10:

    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from bs4 import BeautifulSoup

    DRIVER_PATH = '$PATH/chromedriver.exe'

    options = Options()
    options.headless = True
    options.add_argument("--window-size=1920,1200")

    driver = webdriver.Chrome(options=options, executable_path=DRIVER_PATH)

    class_name = "matchHistoryRow__dartThrows"

    def write_to_output(url):
        driver.get(url)
        soup = BeautifulSoup(driver.page_source, 'html.parser')
        print(soup.find_all("div", {"class": class_name}))
        return

This is the schema I am trying to scrape. I would like to get the pair of spans on either side of each colon and put them into separate columns of a CSV. The problem is that the class comes either before or after the colon, so I'm not sure how to go about doing this. For example:
321:501 180, 321:361140+, 224:361
I'd like this to be represented this way in a CSV:

    player_1_score,player_2_score
    321,501
    321,361
    224,361

What's the best way to go about this?

1 Answer

You can use regex to parse the scores (the easiest method, if the text is structured accordingly):

    import re
    import pandas as pd
    from bs4 import BeautifulSoup

    html_doc = """
    321:501 180, 321:361140+, 224:361
    """

    soup = BeautifulSoup(html_doc, "html.parser")

    # 1. parse whole text from a row
    txt = soup.select_one(".matchHistoryRow__dartThrows").get_text(
        strip=True, separator=" "
    )

    # 2. find scores with regex
    scores = re.findall(r"(\d+)\s+:\s+(\d+)", txt)

    # 3. create dataframe from the regex matches
    df = pd.DataFrame(scores, columns=["player_1_score", "player_2_score"])
    print(df)
    df.to_csv("data.csv", index=False)

Prints:

      player_1_score player_2_score
    0            321            501
    1            321            361
    2            224            361

This creates data.csv.

Another method, without using re:

    scores = [
        s.get_text(strip=True)
        for s in soup.select(
            ".matchHistoryRow__dartThrows > span > span:nth-of-type(1), .matchHistoryRow__dartThrows > span > span:nth-of-type(2)"
        )
    ]

    df = pd.DataFrame(
        {"player_1_score": scores[::2], "player_2_score": scores[1::2]}
    )
    print(df)
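For anyone who wants to try the regex approach on every matching row rather than only the first, here is a minimal sketch. The sample markup is an assumption: its nesting (a div containing one span per score pair, each holding two inner spans) is inferred from the CSS selectors in the answer, and the class names `thrower` and `bonus` are made up for illustration. The helper names `extract_score_pairs` and `pairs_to_csv` are likewise hypothetical, and the standard-library `csv` module is swapped in for pandas to keep the example small.

```python
import csv
import io
import re

from bs4 import BeautifulSoup

ROW_CLASS = "matchHistoryRow__dartThrows"

# Hypothetical markup: the nesting follows the selectors used in the answer,
# but the "thrower" and "bonus" class names are invented for this example.
SAMPLE_HTML = """
<div class="matchHistoryRow__dartThrows">
  <span><span class="thrower">321</span>:<span>501</span> <span class="bonus">180</span></span>,
  <span><span>321</span>:<span class="thrower">361</span> <span class="bonus">140+</span></span>,
  <span><span>224</span>:<span>361</span></span>
</div>
"""

# Two runs of digits separated by a colon, with optional whitespace around it.
SCORE_PAIR = re.compile(r"(\d+)\s*:\s*(\d+)")


def extract_score_pairs(page_source):
    """Return (player_1_score, player_2_score) string tuples from every row."""
    soup = BeautifulSoup(page_source, "html.parser")
    pairs = []
    for row in soup.find_all("div", class_=ROW_CLASS):
        # separator=" " keeps adjacent spans apart, so "501" and "180"
        # are not fused into a single number before the regex runs.
        text = row.get_text(strip=True, separator=" ")
        pairs.extend(SCORE_PAIR.findall(text))
    return pairs


def pairs_to_csv(pairs):
    """Serialise the pairs as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["player_1_score", "player_2_score"])
    writer.writerows(pairs)
    return buffer.getvalue()


pairs = extract_score_pairs(SAMPLE_HTML)
csv_text = pairs_to_csv(pairs)
print(csv_text)
```

With Selenium, the same function could presumably be fed `driver.page_source` after `driver.get(url)`, provided the live page really uses this nesting. That has not been checked against the site here.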
