I want to extract (scrape) the "Matrix form" dataset from the BCS website [1], i.e., the data that appears in the third column of the table.
I tried the following Python snippet, but have not been able to figure out how to get at that column:
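import requests
from bs4 import BeautifulSoup

# Local SOCKS5 proxy; the socks5h scheme needs the requests[socks] extra (PySocks).
proxies = {
    'http': 'socks5h://127.0.0.1:18888',
    'https': 'socks5h://127.0.0.1:18888',
}
# Silence the InsecureRequestWarning triggered by verify=False below.
requests.packages.urllib3.disable_warnings()
url = 'https://www.cryst.ehu.es/cgi-bin/plane/programs/nph-plane_getgen?gnum=17&type=plane'
r = requests.get(url, proxies=proxies, verify=False)
soup = BeautifulSoup(r.content, features="lxml")
table = soup.find('table')
# This is where I am stuck: find_all('id') searches for <id> tags,
# so it presumably matches nothing on this page.
ids = table.find_all('id')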
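To make the goal concrete, here is a minimal sketch of the kind of extraction I am after, run against a made-up HTML fragment rather than the real page. The table layout, the <pre> cells, and the names SAMPLE_HTML, ColumnCollector, third_column and parse_matrix are all assumptions for illustration, not the actual structure of the BCS page. It uses only the standard library's html.parser (not BeautifulSoup) so that it can be run without the proxy or any extra packages, and it does not handle nested tables.

```python
from html.parser import HTMLParser

# Made-up stand-in for the real page; the actual BCS markup may well differ.
SAMPLE_HTML = """
<table>
  <tr><th>No.</th><th>(x,y) form</th><th>Matrix form</th></tr>
  <tr><td>1</td><td>x,y</td><td><pre>1 0 0
0 1 0</pre></td></tr>
  <tr><td>2</td><td>-x,-y</td><td><pre>-1  0 0
 0 -1 0</pre></td></tr>
</table>
"""


class ColumnCollector(HTMLParser):
    """Collect the text of one column (0-based) from a flat, non-nested table."""

    def __init__(self, column):
        super().__init__()
        self.column = column
        self.cell_index = -1   # position of the current cell within its row
        self.in_cell = False
        self.buffer = []       # text fragments of the cell being read
        self.cells = []        # finished cell texts for the wanted column

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.cell_index = -1          # a new row restarts the cell count
        elif tag in ("td", "th"):
            self.cell_index += 1
            self.in_cell = True
            self.buffer = []

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self.in_cell:
            if self.cell_index == self.column:
                self.cells.append("".join(self.buffer).strip())
            self.in_cell = False

    def handle_data(self, data):
        if self.in_cell:
            self.buffer.append(data)


def third_column(html):
    """Return (header, values) for the third column of the table in html."""
    parser = ColumnCollector(column=2)
    parser.feed(html)
    parser.close()
    header, *values = parser.cells
    return header, values


def parse_matrix(text):
    """Turn whitespace-separated rows of integers into a list of lists."""
    return [[int(tok) for tok in line.split()]
            for line in text.splitlines() if line.strip()]


header, values = third_column(SAMPLE_HTML)
print(header)
for value in values:
    print(parse_matrix(value))
```

If the real page nests tables inside the cells, or renders the matrices with something other than plain text, the cell counting above would need to be adapted.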
My Python environment is as follows:
werner@X10DAi:~$ pyenv shell datasci
(datasci) werner@X10DAi:~$ python --version
Python 3.11.1
Any tips will be appreciated.
[1]
https://www.cryst.ehu.es/cgi-bin/plane/programs/nph-plane_getgen?gnum=17&type=plane
Regards,
Zhao