1. Is polipo not active as an offline proxy, or is the css not working, or something else?

----------------------
502 Disconnected operation and object not in cache

The following error occurred while trying to access http://solicitors.lawsociety.org.uk/office/476844/a-e-payne-limited:

502 Disconnected operation and object not in cache
Generated Wed, 26 Aug 2015 12:04:20 BST by Polipo on sparky:8123.
----------------------
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<title>Proxy error: 502 Disconnected operation and object not in cache.</title>
</head>
<body>
<h1>502 Disconnected operation and object not in cache</h1>
<p>The following error occurred while trying to access <strong>http://bugs.pearsoncomputing.net/show_bug.cgi?id=1998</strong>:<br><br>
<strong>502 Disconnected operation and object not in cache</strong></p>
<hr>Generated Thu, 27 Aug 2015 12:12:42 BST by Polipo on <em>sparky:8123</em>.
</body>
</html>
----------------------
xpath
.//html/body/p/strong[2]/text()

Either add code to catch the 502 from polipo and write the unfetched URL to a file, or
check that polipo is working, i.e. that it only fetches pages that have not already been fetched.

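
A minimal sketch of the first option, assuming the scraper has the response status code and body text to hand. The function names and the unfetched_urls.txt file name are hypothetical; the marker string is the one polipo prints in the error page above.

```python
from pathlib import Path

# Marker text taken from the polipo error page quoted above.
POLIPO_OFFLINE_MARKER = "502 Disconnected operation and object not in cache"


def is_polipo_cache_miss(status, body):
    """Return True if the response is polipo's offline cache-miss page."""
    return status == 502 and POLIPO_OFFLINE_MARKER in body


def record_unfetched(url, path="unfetched_urls.txt"):
    """Append url to the file, one URL per line (file name is hypothetical)."""
    with Path(path).open("a", encoding="utf-8") as fh:
        fh.write(url + "\n")


def handle_response(url, status, body, path="unfetched_urls.txt"):
    """Log cache misses and tell the caller whether the page can be parsed."""
    if is_polipo_cache_miss(status, body):
        record_unfetched(url, path)
        return False
    return True
```

Checking the body as well as the status code keeps a genuine upstream 502 from being mistaken for a cache miss.
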
2. are string() xpaths returning more than one item when needed?

Person
Roles at this organisation - first only
Roles at other organisations - missing

Office

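
Possible cause of "first only": in XPath 1.0, string() applied to a node-set returns the string value of the first node only, so it can never return several items. A sketch of the usual workaround, selecting the nodes first and taking the text of each one. The markup and the all_texts helper are hypothetical (the real page structure is not recorded in these notes), and the standard-library ElementTree stands in for whatever parser the scraper uses.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the real page structure is not shown in these notes.
SAMPLE = """
<div>
  <ul class="roles">
    <li>Partner</li>
    <li>Compliance <b>Officer</b></li>
  </ul>
</div>
"""


def all_texts(root, path):
    """Return the full text of every element matched by path, not just the first."""
    return [" ".join("".join(el.itertext()).split()) for el in root.findall(path)]


root = ET.fromstring(SAMPLE)
roles = all_texts(root, ".//ul[@class='roles']/li")
print(roles)  # ['Partner', 'Compliance Officer']
```

With an XPath 1.0 engine the same idea is to select the li nodes and evaluate string(.) on each, rather than wrapping the whole selection in one string() call.
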
3. lawsoc_prefix.txt is not being updated, so each run restarts from the beginning again.

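
A sketch of one way to checkpoint, assuming lawsoc_prefix.txt holds the last prefix that was fully processed (the notes do not say what it stores). The function names are hypothetical.

```python
import os
from pathlib import Path


def load_prefix(path, default=""):
    """Return the last saved prefix, or default if no checkpoint exists yet."""
    p = Path(path)
    return p.read_text(encoding="utf-8").strip() if p.exists() else default


def save_prefix(path, prefix):
    """Write the prefix to a temp file, then rename it over the checkpoint."""
    tmp = Path(str(path) + ".tmp")
    with tmp.open("w", encoding="utf-8") as fh:
        fh.write(prefix + "\n")
        fh.flush()
        os.fsync(fh.fileno())
    os.replace(tmp, path)  # atomic on POSIX, so a crash never leaves a half-written file


def remaining(prefixes, last_done):
    """Return the prefixes still to process after last_done."""
    if last_done in prefixes:
        return prefixes[prefixes.index(last_done) + 1:]
    return list(prefixes)
```

Calling save_prefix after each prefix completes, and load_prefix plus remaining at start-up, would let a restart pick up where the previous run stopped.
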
4. signal.SIGHUP doesn't work for a non-root user when killing tor, and rotating the proxy doesn't work when privoxy points to it; in any case we are using Tor Browser at the moment.

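
Background: sending a signal to a process owned by another user fails with a permission error, and SIGHUP only makes tor reload its config. A possible alternative is to ask tor for new circuits over its control port, which needs no root. A sketch, assuming ControlPort is enabled in torrc with password authentication; the port default (9051 for the tor daemon, Tor Browser normally uses 9151), the password, and the function names are assumptions.

```python
import socket


def newnym_commands(password):
    """Build the control-port lines that authenticate and request new circuits."""
    escaped = password.replace("\\", "\\\\").replace('"', '\\"')
    return [f'AUTHENTICATE "{escaped}"\r\n'.encode(), b"SIGNAL NEWNYM\r\n"]


def request_new_identity(password, host="127.0.0.1", port=9051, timeout=10):
    """Send the commands; return True only if tor answers 250 to each one."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        reader = sock.makefile("rb")
        for command in newnym_commands(password):
            sock.sendall(command)
            reply = reader.readline()
            if not reply.startswith(b"250"):
                return False
    return True
```
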
5. duplicates are being added to all files

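
A sketch of one way to stop this, assuming the output files are line-oriented with one record per line (not confirmed by these notes). The helper names are hypothetical.

```python
from pathlib import Path


def load_seen(path):
    """Return the set of lines already present in the output file."""
    p = Path(path)
    if not p.exists():
        return set()
    return set(p.read_text(encoding="utf-8").splitlines())


def append_unique(path, lines, seen):
    """Append only lines not in seen; update seen and return how many were written."""
    written = 0
    with Path(path).open("a", encoding="utf-8") as fh:
        for line in lines:
            if line not in seen:
                fh.write(line + "\n")
                seen.add(line)
                written += 1
    return written
```

Loading the seen set once at start-up also covers duplicates caused by a restart re-scraping pages that were already written.
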
6. add Elasticsearch mapping for lawsoc/office and lawsoc/person to specify:
lawsoc/person
a. date format for the date field
b. _parent for person solicitor_id
c. unique key on lawsoc's person_id, copy person_id -> _id

{
  "person" : {
    "properties" : {
      "person_id" : {
        "type" : "string",
        "index" : "not_analyzed",
        "copy_to" : "_id"
      }
    }
  }
}

lawsoc/office
a. unique key on solicitor_id, copy solicitor_id -> _id
b. geo_point for location

{
  "office" : {
    "properties" : {
      "solicitor_id" : {
        "type" : "string",
        "index" : "not_analyzed",
        "copy_to" : "_id"
      }
    }
  }
}

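
A sketch of how the remaining points (date format, _parent, geo_point) might look, written as Python dicts ready to serialise. Assumptions: Elasticsearch 1.x/2.x mapping syntax, to match the "string" type in the fragments above; the field names "date" and "location" are taken from the lists above; the date format is a placeholder, since the real format is not recorded here. The copy_to line is left out on the assumption that _id is supplied by the client at index time, and the parent's solicitor_id would likewise be passed as the parent id when indexing each person.

```python
import json

# Assumption: Elasticsearch 1.x/2.x mapping syntax, matching the "string"
# fragments above.
PERSON_MAPPING = {
    "person": {
        "_parent": {"type": "office"},
        "properties": {
            "person_id": {"type": "string", "index": "not_analyzed"},
            # Placeholder format: replace with the format the scraped dates use.
            "date": {"type": "date", "format": "yyyy-MM-dd"},
        },
    }
}

OFFICE_MAPPING = {
    "office": {
        "properties": {
            "solicitor_id": {"type": "string", "index": "not_analyzed"},
            "location": {"type": "geo_point"},
        }
    }
}


def mapping_body(mapping):
    """Serialise a mapping dict to the JSON body of a put-mapping request."""
    return json.dumps(mapping, indent=2, sort_keys=True)
```
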
7. Copy to David
rsync -avz -e ssh markm@igm-legal.co.uk:/var/tmp/lawsoc_new /var/tmp/