T.R | Title | User | Personal Name | Date | Lines |
---|
324.1 | | ALIEN::POSTPISCHIL | | Fri Jul 26 1985 15:05 | 32 |
| It is not possible, in general, to convert a number from one base to another
in a serial fashion, looking at only a few digits at a time.
Suppose you are trying to convert 6553n from decimal to hexadecimal. Depending
on whether the last digit is a 5 or a 6, the result could be FFFF or 10000.
Obviously, you must know the last digit of the number before you can obtain a
single digit of the result.
Similarly, if you were trying to convert from something such as trinary to
decimal, you would need to know the first digit before you could obtain any
digit of the result.
However, there are cases where such conversion may be possible. In converting
from decimal to hexadecimal, looking at the last 4 decimal digits gives
sufficient information to compute the last hexadecimal digit. A conversion can
proceed from last digit to first. (It is not even necessary to read the file
being transmitted backwards; simply consider it as a number written backwards.)
I believe a necessary and sufficient condition for this requirement is that the
destination base must be a power of some factor of the source base. (16 is the
fourth power of two, and two divides 10.) 96 and 256 do not satisfy this
requirement. Since all factors of 256 are powers of two, any destination base
must also be a power of two.
It might not be necessary to convert the file from one base to another. Such
a scheme gives an optimal compression if every byte occurs with the same
frequency, but this is not always the case. If the frequency of different
bytes is not always equal, schemes similar to those described in the base note
may be better.
-- edp
|
324.2 | | GOLLY::GILBERT | | Fri Jul 26 1985 20:36 | 20 |
| It *is* possible, in general, to convert a number from one base to another in
serial fashion, looking at only a few digits at a time, in O(N) time, where N is
the size (in 'digits') of the number, and calculations on reasonably sized
numbers (no larger that base^2) take unit time.
Let the input and output numbers be read and written (respectively) serially
from the least-significant digit to the most-significant digit. The algorithm
is very straight-forward.
If the numbers are accessed from most-significant to least-significant, simply
reverse the input number, convert the number to the new base, reverse it, and
output this reversed number. Although this requires storage for the entire
number, the reversals are *still* O(N) on the size of the numbers.
If the numbers must be accessed from most- to least-significant digit, AND only
limited storage is available, I believe the conversion can't be done in O(N)
time (in general), *unless* the number of digits in the source string is known
in advance. With the input size known in advance, the algorithm seems tricky.
- Gilbert
|
324.3 | | WHYVAX::HETRICK | | Fri Jul 26 1985 21:04 | 6 |
| Well, one response says "it can't be done", and one says "yes it can, and
it's trivial."
I'm dumb today. What is the trivial algorithm?
Brian Hetrick
|
324.4 | | ALIEN::POSTPISCHIL | | Fri Jul 26 1985 21:18 | 15 |
| Re .2:
> Let the input and output numbers be read and written (respectively) serially
> from the least-significant digit to the most-significant digit. The algorithm
> is very straight-forward.
I do not mind being contradicted, but you should support your claim. Can you
present such an algorithm?
The last 11 digits of a 12 digit number in base three are 00000000000. Please
tell me any of the six significant digits of the representation of this
number in base 10.
-- edp
|
324.5 | | TOOLS::STAN | | Fri Jul 26 1985 21:22 | 1 |
| I think he's got you there!
|
324.6 | | BAGELS::ROSENBAUM | | Fri Jul 26 1985 21:33 | 11 |
| {I interested in the answer to .4 as well, but in the mean time..}
Somewhere it was mentioned that all this is useful if the elements of the
character stream are randomly distributed. This may be the case for binary
files, although even in that case it's unlikely. In the case of plaintext,
I would think that it would much more fruitful to pursue text compression
(blank compression, huffman encoding, etc.) rather than radix conversion.
Who says we have to use the real world as our model, though?
__Rich
|
324.7 | | GOLLY::GILBERT | | Fri Jul 26 1985 23:55 | 15 |
324.8 | | ALIEN::POSTPISCHIL | | Sun Jul 28 1985 19:44 | 10 |
| Re .7:
The quoted text does not indicate the numbers are processed in serial fashion.
The last 11 digits of a 12 digit number in base three are 00000000000. Please
tell me any of the six significant digits of the representation of this
number in base 10.
-- edp
|
324.9 | | ALIEN::POSTPISCHIL | | Mon Jul 29 1985 14:16 | 25 |
324.10 | | MARIAH::LARY | | Thu Nov 07 1985 19:43 | 5 |
| In a more prosaic vein, 95^16 is a little larger (about a factor of 2) than
256^13, so you can convert the data in 13-byte groups by doing a fairly
simple (VAX instruction set peculiarities aside) set of quadword divides.
The efficiency is 16/13 or about 1.23:1
|
324.11 | 2^n base conversion trick | GUMDRP::HAINSWORTH | Shoes and ships and sealing wax | Fri Jun 12 1987 22:25 | 37 |
| Re .0
I thought of base 64 before I finished reading the first page of
your note.
As somebody mentioned earlier, there IS a trick to converting between
bases that are powers of the same number. Let me illustrate with
hexadecimal (2^4) to octal (2^3) conversion:
| |
Hexadecimal | D | 9 | E | 2 |
Binary 1 1 0 1 0 1 0 1 1 1 1 0 0 0 1 0|
Octal 1 | 5 | 2 | 7 | 4 | 2 |
| |
Note that these numbers line up every 12 binary digits, and thus
every 3 hexadecimal digits will turn into 4 octal digits (becoming
12 binary digits in between). Thus the process is "encapsulated":
as soon as you accumulate 3 hexadecimal digits, you can send out
4 octal digits and clear your buffer.
The repeating period (number of digits) is the least common multiple
of the two powers (in this case, 4 and 3) to which the root number
(in this case, 2) is raised to get the two bases.
So how can we efficiently map a base 256 number into a base that's
no bigger than 96? Hmmm...
Well, 256 = 2^8.
The biggest power of 2 that's less than 96 is 2^6, or 64.
To map 2^8 into 2^6, I need a buffer for LCM(8,6) = 24 bits.
That's 3 input characters, or 4 output characters.
I don't think you can do any better without opening Pandora's box
of generalized base conversion.
John
|