Unofficial empeg BBS

#348325 - 24/10/2011 14:27 A crypto puzzle of sorts
tonyc
carpal tunnel

Registered: 27/06/1999
Posts: 7058
Loc: Pittsburgh, PA
Boring background info

In one of the recent threads here, I mentioned my disappointment that Verizon FIOS service isn't very friendly to people who use their own home routers. If you just use FIOS for Internet you're generally fine with your own router, but if you also have TV services, and want to use Video on Demand, remote DVR scheduling, etc., you run into problems unless you use the Verizon-provided router.

The crux of the issue is that Verizon uses TR-069 to perform some automated setup/configuration of the router, and to figure out what set-top boxes are on the network, so that port forwards can be maintained to provide connectivity to the set-top boxes. You can set these port forwards up yourself, but if the Verizon-provided modem doesn't have a valid external WAN address, it won't report things back to the Verizon mothership, and you're out of luck.

The Rube Goldberg solution that people have come up with requires using three (!) routers: the primary router on the network perimeter, the Verizon-provided Actiontec router to handle some of the TV stuff for the set-top boxes (guide data, VOD, etc.) and a third router whose only job is to serve the external internet address to the Actiontec's WAN port via DHCP, tricking it into thinking it's the primary router. This is what it ends up looking like on my network:



The goal is to remove "lenny", who's doing nothing except fooling "larry" into thinking he's "barney".

Digging into the weeds, I learned more about the technical details of this TR-069 communication between the Verizon router and the mothership, and I started to think it might be possible to use some software to send the correct data back to Verizon instead of using hardware to trick the router. Though the outbound connection is SSL-encrypted by default, I found out that one can set the URL for that connection to an HTTP URL on the local network, which opens up the possibility of a man-in-the-middle proxy where we can do any substitutions we need to do, then send it out over SSL to the "real" configuration server.

I've made some good progress on this, but one piece I still haven't solved is how to decode some of the configuration values that the router obscures using what seems to be a primitive cipher. I thought I had solved it, but there's something else going on with it that I can't quite figure out, so I thought I'd pose it here as a question, because we've got a bunch of smart people who love solving hard problems. :)

The puzzle

So, with all that out of the way, here's where I'm stuck.

There are a couple of passwords stored in the router's config that are used to authenticate to Verizon's servers, but they're stored in some sort of obfuscated format, and they need to be presented to the server in plain text. The good news is that it's trivial to store a value in the config that uses the encoding/encryption scheme, so I can generate an arbitrarily long list of raw/cooked text pairs, in an attempt to reverse-engineer the encoding scheme.

The first insight I had from just eyeballing the encoding scheme is that it uses an HTML-entity-like encoding for some characters, e.g.:

Code:
&a7;TU&9b;&97;&cb;&1c;&b4;&a1;&89;3&91;&bd;e&a7;&f7;


So, an ampersand followed by a two-digit hex number and a semicolon represents a non-printable byte, with regular printable ASCII characters mixed in. The next observation was that, once you decode these HTML-style entities, the lengths of the raw and cooked values are identical. Seeing this, I started to think it was a simple 1-to-1 mapping of ASCII values, but shifted in some way, à la ROT13 / Caesar cipher.
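
For anyone who wants to play along, the entity decoding can be sketched in Python roughly like this. It's only a sketch: the name `decode_entities` is made up for illustration, and it assumes every `&xx;` group is exactly two hex digits and that a literal ampersand never shows up unescaped in a cooked value (I haven't confirmed either).

```python
import re

# One "&xx;" group, where xx is exactly two hex digits. That shape is an
# assumption based on the samples in this post, not on any documentation.
ENTITY = re.compile(r"&([0-9a-fA-F]{2});")

def decode_entities(cooked):
    """Turn a cooked config string into a list of byte values (0-255)."""
    out = []
    pos = 0
    while pos < len(cooked):
        m = ENTITY.match(cooked, pos)
        if m:
            # Entity: take the hex digits as the byte value.
            out.append(int(m.group(1), 16))
            pos = m.end()
        else:
            # Plain printable character: use its ASCII code.
            out.append(ord(cooked[pos]))
            pos += 1
    return out

sample = "&a7;TU&9b;&97;&cb;&1c;&b4;&a1;&89;3&91;&bd;e&a7;&f7;"
values = decode_entities(sample)
print(len(values))  # 16
print(" ".join("%02x" % v for v in values))
```

Working on the list of byte values instead of the entity string is what makes the raw and cooked lengths line up.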

I tried to measure the length of each of these shifts for each input character with some simple input, e.g.

Code:
input   output          shift
0000    &86;$&1f;&80;   56 f4 ef 50


So, we start with the first character of the input string (the number zero, ascii 0x30) and add 0x56 to get 0x86 in the output. Then the second character is zero again, but this time we add f4 (wrapping around using mod 256) to get 0x24, which is a dollar sign in ASCII.
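
As a sketch, the shift calculation described above looks something like this in Python. The function name `shifts_mod256` is just for illustration, and it takes the cooked value as a list of already-decoded byte values.

```python
def shifts_mod256(raw, cooked_bytes):
    """Per-position difference between cooked and raw bytes, modulo 256."""
    return [(c - ord(r)) % 256 for r, c in zip(raw, cooked_bytes)]

# "0000" came back as &86;$&1f;&80; which is bytes 0x86, 0x24 ('$'), 0x1f, 0x80.
cooked = [0x86, 0x24, 0x1F, 0x80]
print(" ".join("%02x" % s for s in shifts_mod256("0000", cooked)))
# prints: 56 f4 ef 50
```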

Doing a few more of these short sequences, I noticed a pattern:

Code:
input   output          shift
0000    &86;$&1f;&80;   56 f4 ef 50
aaaa    &b7;UP&b1;      56 f4 ef 50
AAAA    &97;50&91;      56 f4 ef 50
zzzz    &d0;ni&ca;      56 f4 ef 50


The next step was to see if the shifts were the same for every pair of input/output strings, and in my initial testing, it looked like they were. I came up with a sequence of shifts that seemed to work for most of the inputs I tried, but for longer / more complex strings, there would be slight differences in certain character positions, e.g.:

Code:
input     output              shift
abcdef    &b7;VR&b4;&99;&10;  56 f4 ef 50 34 aa
ABCDEF    &97;62&94;y&ef;     56 f4 ef 50 34 a9


This suggests that whatever scheme they're using is doing some minor perturbation of these shifts based on... I don't know what, really. To illustrate further, here are some more samples from some sequential tests I ran (ellipses indicate that the pattern repeats):

Code:
000000           &86;$&1f;&80;d&d9;               56f4ef5034a9
000001           &86;$&1f;&80;d&da;               56f4ef5034a9
000002           &86;$&1f;&80;d&db;               56f4ef5034a9
000003           &86;$&1f;&80;d&dc;               56f4ef5034a9
...
00000U           &86;$&1f;&80;d&fe;               56f4ef5034a9
00000V           &86;$&1f;&80;d&ff;               56f4ef5034a9
00000W           &86;$&1f;&80;d&01;               56f4ef5034aa


Code:
aaaaaa           &b7;UP&b1;&95;&0b;               56f4ef5034aa
aaaaab           &b7;UP&b1;&95;&0c;               56f4ef5034aa
aaaaac           &b7;UP&b1;&95;&0d;               56f4ef5034aa
...
aaaaa|           &b7;UP&b1;&95;&26;               56f4ef5034aa
aaaaa}           &b7;UP&b1;&95;&27;               56f4ef5034aa
aaaab!           &b7;UP&b1;&96;&ca;               56f4ef5034a9
aaaab#           &b7;UP&b1;&96;&cc;               56f4ef5034a9


Things get even murkier when you start using longer strings. Here are some results from some random test runs (encoded output omitted for brevity):
Code:
input              shift
htaoJdCMoKhRlQkN   56 f4 ef 50 34 aa ef 6b 55 4b 03 3c 9b 01 78 b4 
rKNKPKEqSSTOBSEX   56 f4 ef 50 34 a9 ef 6b 55 4b 03 3c 9a 01 78 b4 
WCUPMgniVItNITJj   56 f4 ef 50 34 aa ef 6b 55 4b 03 3c 9a 01 78 b4 
yQUpwjiGqgZlPOoc   56 f4 ef 50 34 aa ef 6b 55 4b 03 3c 9a 01 78 b4 
bjOCTlTNlDQkiAsY   56 f4 ef 50 34 aa ef 6b 55 4b 03 3c 9b 01 78 b4 
yVUDfCDYchhJiEKd   56 f4 ef 50 34 a9 ef 6b 55 4b 03 3c 9b 01 78 b4 
vGknPsLbqIJomZQT   56 f4 ef 50 34 aa ef 6b 55 4b 03 3c 9b 01 78 b4 
nzkXHVZkIFDqMCvp   56 f4 ef 50 34 a9 ef 6b 55 4b 03 3c 9a 01 78 b4 
nTHihKzuywlMvCZY   56 f4 ef 50 34 a9 ef 6b 55 4b 03 3c 9b 01 78 b4 
pcrIFfIhuwrBJNsr   56 f4 ef 50 34 aa ef 6b 55 4b 03 3c 9a 01 78 b4 


In these examples, the shifts of character positions 6 and 13 are off by one in some cases. So close, yet so far!

Once you start throwing in punctuation, things get even weirder:

Code:
input              shift
c-3Lp4bUWp$>@/j]   56 f4 ef 50 34 a9 ef 6b 55 4b 03 3c 9a 01 78 b4 
3`eIP4is]}`2*ZrV   56 f4 ef 50 34 a9 ef 6b 55 4b 03 3c 9a 01 78 b4 
p=`9C$bp&]:Dxb@\   56 f4 ef 50 34 a9 ef 6b 55 4b 03 3c 9b 01 78 b4 
`8f*s&"n~f8e(~#d   56 f4 ef 50 34 a9 b5 b7 65 33 d5 69 5d 57 1d f5 
|LbLIsBx@~8Ep5Jh   56 f4 ef 50 34 aa ef 6b 55 4b 03 3c 9b 01 78 b4 
zYrslorJ%>RT4u"x   56 f4 ef 50 34 aa ef 6b 55 4b 03 3c 9a 01 3e 0a 
I$ih@qt4cYW(P4vF   56 f4 ef 50 34 aa ef 6b 55 4b 03 3c 9a 01 78 b3 
W;e;`\q/aE=9a`[4   56 f4 ef 50 34 aa ef 6b 55 4b 03 3c 9a 01 78 b3 
=flO1:o;FB6h-Oed   56 f4 ef 50 34 a9 ef 6b 55 4b 03 3c 9a 01 78 b4 
#"ZbZez|:5vX8aPy   56 ba 27 58 2c b5 04 6d 13 46 44 1e 7a 2a 67 dd 


So, there's clearly a pattern, but I can't for the life of me figure out what the algorithm is doing to induce these slight changes in how each character is shifted.

Sadly, I never took any crypto classes, and I was never all that great at math, so I'm sort of stumped on what's going on. I was hoping someone here might have some insight on other things to try, or what the algorithm could be doing behind the scenes to introduce these slight changes.

I'm sure there are probably some better forums on the Internets to ask this kind of question, but I know folks here like a challenge, so I thought I'd post it here first. And, if not, I thought it was a fun story to tell about hacking hardware and trying to make it work better, which is something dear to the hearts of most folks here.


Edited by tonyc (24/10/2011 14:29)
_________________________
- Tony C
my empeg stuff

#348327 - 24/10/2011 15:10 Re: A crypto puzzle of sorts [Re: tonyc]
wfaulk
carpal tunnel

Registered: 25/12/2000
Posts: 16706
Loc: Raleigh, NC US
The first thing that occurs to me is that the punctuation isn't ASCII-numbered. Try some sets of patterns like "!000", "0!00", "00!0", and "000!" and see if the offset of the offset is the same for each of the '!'s. If so, then the value of '!' just isn't the ASCII-normal value of 0x21.

Edit: Looking at your sample data again, it looks like that's not the case.


Edited by wfaulk (24/10/2011 15:13)
_________________________
Bitt Faulk

#348328 - 24/10/2011 15:19 Re: A crypto puzzle of sorts [Re: tonyc]
wfaulk
carpal tunnel

Registered: 25/12/2000
Posts: 16706
Loc: Raleigh, NC US
It looks like from your earlier sets of data that the offset value is high if the encoded value has a low numerical value and low if it has a high numerical value.

I also notice that the shift happens where the encoded value skips from 0xff to 0x01, skipping over 0x00.

Two things come to mind: overflow and two's-complement numbering.
_________________________
Bitt Faulk

#348330 - 24/10/2011 16:00 Re: A crypto puzzle of sorts [Re: wfaulk]
peter
carpal tunnel

Registered: 13/07/2000
Posts: 4181
Loc: Cambridge, England
Yeah, they're avoiding an output of 0x00. So it's not modulo-256, it's modulo-255 with an offset of 1. Unless you've got double-quote characters, whereupon it does something weird. (It's specifically double quotes: it doesn't do it for ampersand, though you don't list a sample with a single-quote character.) The fear of 0x00 and " is probably something to do with the HTML-entity-like encoding it ends up as.

Peter

#348331 - 24/10/2011 16:05 Re: A crypto puzzle of sorts [Re: peter]
wfaulk
carpal tunnel

Registered: 25/12/2000
Posts: 16706
Loc: Raleigh, NC US
Oh, good call on the quotation mark. That is definitely where it totally wigs out. I wonder if it's reencoding that into multiple characters. It would explain why everything that comes after it is off.
_________________________
Bitt Faulk

#348332 - 24/10/2011 16:08 Re: A crypto puzzle of sorts [Re: peter]
peter
carpal tunnel

Registered: 13/07/2000
Posts: 4181
Loc: Cambridge, England
Hmm, in the double-quote case it's always decimal 58 off from where it should be. Decimal 58 plus the ASCII value of double quote is the ASCII value of backslash "\". Is it just escaping the quote? Is the answer, in that case, longer than it should be? What happens if, when calculating the shift, you assume double-quote is first replaced by backslash-double-quote?

Peter

#348333 - 24/10/2011 16:23 Re: A crypto puzzle of sorts [Re: peter]
wfaulk
carpal tunnel

Registered: 25/12/2000
Posts: 16706
Loc: Raleigh, NC US
I think we can prove that pretty easily by getting the output of

aaaaaa

and

"aaaaa
_________________________
Bitt Faulk

#348335 - 24/10/2011 16:49 Re: A crypto puzzle of sorts [Re: wfaulk]
peter
carpal tunnel

Registered: 13/07/2000
Posts: 4181
Loc: Cambridge, England
Look what happens when you Google 56 f4 ef...

Peter

#348336 - 24/10/2011 16:56 Re: A crypto puzzle of sorts [Re: wfaulk]
tonyc
carpal tunnel

Registered: 27/06/1999
Posts: 7058
Loc: Pittsburgh, PA
Excellent observations, guys.

I was worried about the double-quote thing myself, and it's possible I erred on how I'm entering them into the router before I read them back, which is via a telnet command interface that uses commands of the form
conf set foo bar
to enter plain-text values, or
conf set_obscure foo bar
to enter values encoded using this algorithm.

The telnet command interface uses quotes to represent string values, so
conf set foo "bar"
sets foo to bar, not "bar" with the quotes embedded.

Because of this, I need to escape the quotes when I enter them in the interface. Backslash does appear to be the correct character to use for escaping quotes, as:
conf set foo \"
sets foo to &22; which is the right hex code for a double-quote. Same for the single-quote (&27;).

Here's what I'm doing to handle these in my script:

Code:
def get(s1):
    s1 = s1.replace("'", "\\'")
    s1 = s1.replace('"', '\\"')
    print s1
    conn.send_command("conf set_obscure foo %s" %(s1))
    s2 = conn.get_config("foo")['/foo']
    s3 = pwdecode3(s1, s2)
    return [ s2, s3 ]


I'll try doing the mod 255 offset 1 thing.

Bitt, Here's the output you asked for:

Code:
aaaaaa           &b7;UP&b1;&95;&0b;   56f4ef5034aa
"aaaaa           xUP&b1;&95;&0b;      1c33ef5034aa


So, yeah, something funky going on with the quotes.
_________________________
- Tony C
my empeg stuff

#348337 - 24/10/2011 17:02 Re: A crypto puzzle of sorts [Re: peter]
tonyc
carpal tunnel

Registered: 27/06/1999
Posts: 7058
Loc: Pittsburgh, PA
Yeah, I actually found that (after I'd already gone through my own work to discover the offset thing, grumble...) but it doesn't seem to hold in all cases. Leaving the quote issue aside, look at the table above with "aaaaa}" and "aaaab!". Why do the offsets on those differ?

I also discovered this link where a guy ran into the same problem:
Quote:

Tried "conf set_obscure /cwmp/password activeVOLUses1", but result was slightly off from original password encoding at the second-to-last character.


The other thing is, I don't see the offset values repeating, even after 1024 input characters. I would have expected them to repeat at some point...


Edited by tonyc (24/10/2011 17:07)
_________________________
- Tony C
my empeg stuff

#348338 - 24/10/2011 17:17 Re: A crypto puzzle of sorts [Re: tonyc]
wfaulk
carpal tunnel

Registered: 25/12/2000
Posts: 16706
Loc: Raleigh, NC US
Um, I think there's something funky going on with your script. Here's the samples reformatted:

Code:
aaaaaa           &b7; U P &b1; &95; &0b;   56 f4 ef 50 34 aa
"aaaaa           x    U P &b1; &95; &0b;   1c 33 ef 50 34 aa


It seems that the actual output differs only in the first character. But your offset calculations show that the first two letters are different.

In other words, 'U' in the second position should always return the same offset, but it's not.

Edit: It may be easier to just avoid double-quotes for now.


Edited by wfaulk (24/10/2011 17:19)
_________________________
Bitt Faulk

#348339 - 24/10/2011 17:27 Re: A crypto puzzle of sorts [Re: wfaulk]
tonyc
carpal tunnel

Registered: 27/06/1999
Posts: 7058
Loc: Pittsburgh, PA
Derp.

I was correctly adding the backslash for sending the command to the router, but I have to unescape the quotes before I compare them to find the offset. Problem is, literal backslashes can be in the value, and, strangely, it doesn't seem to want you to double them up when you enter them, so it gets confusing.
_________________________
- Tony C
my empeg stuff

#348340 - 24/10/2011 17:29 Re: A crypto puzzle of sorts [Re: tonyc]
tonyc
carpal tunnel

Registered: 27/06/1999
Posts: 7058
Loc: Pittsburgh, PA
A few results after fixing the quoting thing:

Code:
aaaaaa           &b7;UP&b1;&95;&0b;  56f4ef5034aa
"aaaaa           xUP&b1;&95;&0b;     56f4ef5034aa
""""""           x&16;&11;rV&cb;     56f4ef5034a9


Edited by tonyc (24/10/2011 17:30)
_________________________
- Tony C
my empeg stuff

#348341 - 24/10/2011 17:44 Re: A crypto puzzle of sorts [Re: tonyc]
wfaulk
carpal tunnel

Registered: 25/12/2000
Posts: 16706
Loc: Raleigh, NC US
In that case, it looks like we're just back to:

Code:
cryptotext = plaintext + offset
if (cryptotext > 0xFF) cryptotext -= 0xFF


So a decoder should look like:

Code:
plaintext = cryptotext - offset
if (plaintext < 0) plaintext += 0xFF
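
For anyone who wants to try it, here's the same pair of rules as runnable Python. It's only a sketch: the function names are made up, and the 0xa9 offset is just the sixth-position value from the "00000U" / "00000V" / "00000W" samples earlier in the thread.

```python
def encode_byte(plain, offset):
    """Add the offset; past 0xFF, wrap by 0xFF so that 0x00 never comes out."""
    cooked = plain + offset
    if cooked > 0xFF:
        cooked -= 0xFF
    return cooked

def decode_byte(cooked, offset):
    """Undo encode_byte for printable ASCII input."""
    plain = cooked - offset
    if plain < 0:
        plain += 0xFF
    return plain

# Sixth position, offset 0xa9: 'V' lands on 0xff, 'W' wraps around to 0x01.
for ch in "UVW":
    c = encode_byte(ord(ch), 0xA9)
    print(ch, "%02x" % c, chr(decode_byte(c, 0xA9)))
```

This reproduces the jump from &ff; straight to &01; in those samples.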


Edited by wfaulk (24/10/2011 17:46)
_________________________
Bitt Faulk

#348344 - 24/10/2011 18:14 Re: A crypto puzzle of sorts [Re: wfaulk]
tonyc
carpal tunnel

Registered: 27/06/1999
Posts: 7058
Loc: Pittsburgh, PA
That's the same as what I'm doing with mod 255. It still yields decoded results that don't match up with what we get back from the router.


Edited by tonyc (24/10/2011 18:17)
_________________________
- Tony C
my empeg stuff

#348345 - 24/10/2011 18:44 Re: A crypto puzzle of sorts [Re: tonyc]
tonyc
carpal tunnel

Registered: 27/06/1999
Posts: 7058
Loc: Pittsburgh, PA
OK, I think I figured it out.

The offsets on that wiki page for the Westell are off, or are only valid for certain inputs.

I was calculating the offsets from the raw and cooked values with mod 256, but if I calculate them with mod 255 instead (no offset to avoid zero), I get offsets that seem to work. I haven't figured out what it's doing with zero yet, though.
_________________________
- Tony C
my empeg stuff

#348347 - 24/10/2011 18:56 Re: A crypto puzzle of sorts [Re: tonyc]
tonyc
carpal tunnel

Registered: 27/06/1999
Posts: 7058
Loc: Pittsburgh, PA
Yeah, I got it. The winning offset values for the first 16 bytes:

Code:
56 f3 ee 50 34 a9 ee 6b 55 4b 03 3c 9a 01 78 b3
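
Here's a rough Python sketch of the whole thing using those offsets, checked against a few of the samples from earlier in the thread. The names `obscure` and `reveal` are made up for illustration, it works on lists of byte values rather than the `&xx;` entity text, and it only handles the first 16 characters since those are the only offsets listed here.

```python
# Offsets for the first 16 positions, as posted above.
OFFSETS = [0x56, 0xF3, 0xEE, 0x50, 0x34, 0xA9, 0xEE, 0x6B,
           0x55, 0x4B, 0x03, 0x3C, 0x9A, 0x01, 0x78, 0xB3]

def obscure(plain):
    """Encode up to 16 printable ASCII characters into cooked byte values."""
    if len(plain) > len(OFFSETS):
        raise ValueError("only the first 16 offsets are known")
    out = []
    for ch, off in zip(plain, OFFSETS):
        v = ord(ch) + off
        if v > 0xFF:
            v -= 0xFF  # wrap by 0xFF, so 0x00 is never produced
        out.append(v)
    return out

def reveal(cooked):
    """Invert obscure(): cooked byte values back to printable ASCII."""
    if len(cooked) > len(OFFSETS):
        raise ValueError("only the first 16 offsets are known")
    chars = []
    for v, off in zip(cooked, OFFSETS):
        p = v - off
        if p < 0:
            p += 0xFF
        chars.append(chr(p))
    return "".join(chars)

for text in ["abcdef", "ABCDEF", "00000W"]:
    cooked = obscure(text)
    print(text, " ".join("%02x" % v for v in cooked), reveal(cooked))
# abcdef b7 56 52 b4 99 10 abcdef
# ABCDEF 97 36 32 94 79 ef ABCDEF
# 00000W 86 24 1f 80 64 01 00000W
```

With these offsets, "abcdef" and "ABCDEF" both come out right in the sixth position, which is where the mod 256 offsets disagreed.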


Thanks, guys.
_________________________
- Tony C
my empeg stuff

#348348 - 24/10/2011 19:18 Re: A crypto puzzle of sorts [Re: tonyc]
tonyc
carpal tunnel

Registered: 27/06/1999
Posts: 7058
Loc: Pittsburgh, PA
And of course, proving that everything possible to do on the Internet has already been done:

http://www.vande-walle.eu/uploads/2010/01/openrg-decrypt.py.txt

Which points to a page with a Javascript implementation.

Oh well, it was fun, anyway.
_________________________
- Tony C
my empeg stuff

#348359 - 25/10/2011 06:43 Re: A crypto puzzle of sorts [Re: tonyc]
Roger
carpal tunnel

Registered: 18/01/2000
Posts: 5685
Loc: London, UK
Originally Posted By: tonyc
... everything possible to do on the Internet has already been done ... with a Javascript implementation


Atwood's Law: any application that can be written in JavaScript, will eventually be written in JavaScript.
_________________________
-- roger
