by lukeasch21
1555 days ago
It's crazy that I stumbled upon this post: I just learned about Hacker News today and saw this on the front page! For a couple of months now I've been working on a sort of revival of Cosmos browser, using a custom encoding and better compression.

The GSM 03.38 spec gave me 125 usable characters for a custom base, but I found that for some reason the 0x10 column of the character set is mostly non-functional with any SMS API like Twilio. Apart from the underscore, the characters in rows 0x00 to 0x0A of that column just show up as "?". I contacted support at multiple API providers (Twilio and Plivo) and found that this is an issue at the operator level, which is quite surprising. I doubt it will ever be fixed because the problem runs so deep. That limits the usable characters to 115, which is still a surprisingly good improvement in encoding efficiency over base64. I also use Google's (relatively) new Brotli compression algorithm, which lets me compress even more efficiently.

I'm still finishing a proof of concept, so I don't have a GitHub repo up yet, but I will soon (I'm a first-year CS student, so I need to make my code somewhat presentable before it draws the internet's scrutiny, haha). Right now the server is written in Python and the client app began as a fork of Cosmos browser. It turns out that getting code from the KitKat era to work on Android 11 isn't the best approach, so I'm looking into rewriting the app.

Regardless, thank you so much for your work! I'm super excited to finish the proof of concept, and I hope to implement the peer-to-peer system you briefly mentioned in the repo by adding a "server" option you could turn on in the app, which would use a user's phone number and internet connectivity to let other users connect through that number.

Edit: I also want to add that I've looked into UCS-2 encoding (here: https://www.twilio.com/docs/glossary/what-is-ucs-2-character...), but I'm not sure the drop in characters per message (only 70 instead of 160) is worth the larger number of available characters (theoretically 36,864, though that probably includes many unusable ones). It would also require a more complicated encoding algorithm, and I'm not sure how well carriers support it internationally.
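To put rough numbers on the trade-offs above, here is a minimal Python sketch, not the project's actual code. It estimates how many whole bytes fit in one message for a given alphabet size, and shows one simple way to turn bytes into base-115 digits (big-integer conversion, which is an assumption; the real encoder may work differently). The function names are made up for illustration, the digits are plain integers from 0 to 114, and mapping them onto the working GSM 03.38 characters is left out.

```python
import math

GSM_CHARS_PER_SMS = 160   # single-segment GSM 7-bit message
UCS2_CHARS_PER_SMS = 70   # single-segment UCS-2 message


def payload_bytes(base: int, chars: int) -> int:
    """Whole bytes that fit in `chars` symbols drawn from `base` symbols."""
    return math.floor(chars * math.log2(base) / 8)


def encode(data: bytes, base: int = 115) -> list[int]:
    """Convert bytes to base-`base` digits, most significant digit first."""
    # A leading 0x01 sentinel byte keeps leading zero bytes from being lost.
    n = int.from_bytes(b"\x01" + data, "big")
    digits = []
    while n:
        n, d = divmod(n, base)
        digits.append(d)
    return digits[::-1]


def decode(digits: list[int], base: int = 115) -> bytes:
    """Inverse of encode(): rebuild the integer, then drop the sentinel."""
    n = 0
    for d in digits:
        n = n * base + d
    raw = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return raw[1:]


if __name__ == "__main__":
    print("base64  :", payload_bytes(64, GSM_CHARS_PER_SMS), "bytes per SMS")
    print("base115 :", payload_bytes(115, GSM_CHARS_PER_SMS), "bytes per SMS")
    print("UCS-2   :", payload_bytes(36864, UCS2_CHARS_PER_SMS), "bytes per SMS")
```

Under these assumptions one 160-character message carries 120 bytes in base64 and 136 bytes in base 115, while a 70-character UCS-2 message carries 132 bytes even if all 36,864 characters were usable, so the UCS-2 route does not obviously come out ahead.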
You should put up a GitHub repo when you're done, regardless of how you feel about the way the code looks. It will absolutely help during a job search.