Here's a hint to get you started. It should be pretty easy to extend; if anything is unclear, let me know.
For starters, pipe your "gunk" into jq like this:
$ echo {\"index\":\"prod-h-006\",\"fields\":{\"identifier\":\"bub_gb_O2EAAAAAMAAJ\",\"title\":\"Die Wissenschaft vom subjectiven Geist\",\"creator\":[\"Karl Rosenkranz\", \"Mr. ABC123\"],\"collection\":[\"europeanlibraries\", \"americana\"],\"year\":1843,\"language\":[\"German\"],\"item_size\":797368506},\"_score\":[50.629513]} | jq
{
  "index": "prod-h-006",
  "fields": {
    "identifier": "bub_gb_O2EAAAAAMAAJ",
    "title": "Die Wissenschaft vom subjectiven Geist",
    "creator": [
      "Karl Rosenkranz",
      "Mr. ABC123"
    ],
    "collection": [
      "europeanlibraries",
      "americana"
    ],
    "year": 1843,
    "language": [
      "German"
    ],
    "item_size": 797368506
  },
  "_score": [
    50.629513
  ]
}
Then start building your output like this:
$ echo {\"index\":\"prod-h-006\",\"fields\":{\"identifier\":\"bub_gb_O2EAAAAAMAAJ\",\"title\":\"Die Wissenschaft vom subjectiven Geist\",\"creator\":[\"Karl Rosenkranz\", \"Mr. ABC123\"],\"collection\":[\"europeanlibraries\", \"americana\"],\"year\":1843,\"language\":[\"German\"],\"item_size\":797368506},\"_score\":[50.629513]} | jq '.fields.identifier + "|" + .fields.title'
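To show how that concatenation might be carried further, here is a minimal sketch. The choice of fields, the "|" separator, and the variable names record and line are my assumptions, not something from your original message; join and tostring are standard jq builtins, and -r prints the raw string rather than a JSON-quoted one.

```shell
#!/bin/sh
# Sample record from this thread, stored unescaped in a shell variable
# (the variable name "record" is only for illustration).
record='{"index":"prod-h-006","fields":{"identifier":"bub_gb_O2EAAAAAMAAJ","title":"Die Wissenschaft vom subjectiven Geist","creator":["Karl Rosenkranz", "Mr. ABC123"],"collection":["europeanlibraries", "americana"],"year":1843,"language":["German"],"item_size":797368506},"_score":[50.629513]}'

# -r prints the raw string instead of a JSON-quoted one.
# join(", ") flattens an array of strings; tostring converts the number
# so that it can be concatenated with "+".
line=$(printf '%s\n' "$record" | jq -r '
  .fields
  | .identifier + "|" + .title + "|" + (.creator | join(", ")) + "|" + (.year | tostring)
')

printf '%s\n' "$line"
```

The parentheses matter: without them the pipe inside (.creator | join(", ")) would swallow the rest of the expression, because "|" binds more loosely than "+".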
jq is an amazing tool: it's a full-fledged programming language. You just need to keep concatenating until you have your desired output. You might even find you can do everything you want inside a jq script instead of what you're doing now. Consider writing a jq script whose first line is #!/usr/bin/jq -f (the -f tells jq to read its program from the script file).
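A rough sketch of what such a script could look like follows. The temporary file, the shortened sample record, and the variable names are assumptions for illustration only. Note that jq needs the -f option to read its program from a file, so the shebang line carries it; here the script is run through "jq -f" explicitly so the sketch does not depend on jq living in /usr/bin.

```shell
#!/bin/sh
# Hypothetical script file; a temporary path keeps the sketch self-contained.
script=$(mktemp)

cat > "$script" <<'EOF'
#!/usr/bin/jq -f
# To jq itself the shebang line above is just a comment.
.fields | .identifier + "|" + .title
EOF

# Shortened version of the sample record from this thread.
record='{"fields":{"identifier":"bub_gb_O2EAAAAAMAAJ","title":"Die Wissenschaft vom subjectiven Geist"}}'

# Running the file through "jq -f" works wherever jq is on PATH;
# -r prints the raw string rather than a JSON-quoted one.
out=$(printf '%s\n' "$record" | jq -r -f "$script")
printf '%s\n' "$out"

rm -f "$script"
```

If jq is installed at /usr/bin/jq, making the file executable with chmod +x should let you pipe records straight into it instead.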
Hope this gets you on the right path!
Michael Grant
________________________________
From: [email protected]
Sent: Friday, March 22, 2024 23:44
To: Albretch Mueller
Cc: debian-user
Subject: Re: trying to parse lines from an awkwardly formatted HAR file ...
On Sat, Mar 23, 2024 at 12:53:24AM -0500, Albretch Mueller wrote:
> out of a HAR file containing lots of obfuscating js cr@p and all kinds of
> nonsense I was able to extract line looking like:
It's not "js cr@p", it is called JSON. And there's a spec for it.
[...]
> I have tried substring substitution, sed et tr to no avail.
You might have a lot of fun trying to parse JSON with sed and
tr.
If you are serious about it, you should try a proper parser
and extractor. I'd recommend jq [1], available in Debian under
the same-named package. I have written a few shell scripts
reaching into the innards of
You'll have to wrap your brain around it, but in the time you
have implemented a parser for js in "sed and tr" (you might
need a dash of "proper programming language" around that, some
luck and a ton of elbow grease) you might have wrapped your
brain like 16 times around jq (or some other appropriate tool).
Cheers
--
tomás