Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
59 changes: 58 additions & 1 deletion include/base64.h
Original file line number Diff line number Diff line change
Expand Up @@ -30,6 +30,12 @@
#include <QString>
#include <QVariant>

#include <array>
#include <map>
#include <numeric>
#include <string>
#include <string_view>

namespace lmms::base64
{

Expand All @@ -53,4 +59,55 @@ namespace lmms::base64

} // namespace lmms::base64

#endif
namespace lmms::base64 {
constexpr inline std::array<char, 64> map =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

constexpr implies inline, so inline should be redundant here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That might not be true for global variables, but regardless, I think all of these implementation details do not belong in the header anyway and should be moved to base64.cpp. Then they certainly will not need the inline.

{
'A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z',
'a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z',
'0','1','2','3','4','5','6','7','8','9',
'-','_'
};
const inline std::map<char, int> rmap {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple of things:

  • Why not use constexpr here as well? If anything, it will be consistent with map.
  • Couldn't we have used a better name, maybe reverseMap instead of rmap?
  • Subjective, but could we also add a space in between the preceding and succeeding constexpr variables? (Usually I wouldn't mind if they were each one line, but they span multiple)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, std::map does not have a constexpr constructor so rmap cannot be constexpr, but I agree with your other points.

{'A', 0}, {'B', 1}, {'C', 2}, {'D', 3}, {'E', 4},
{'F', 5}, {'G', 6}, {'H', 7}, {'I', 8}, {'J', 9},
{'K', 10}, {'L', 11}, {'M', 12}, {'N', 13}, {'O', 14},
{'P', 15}, {'Q', 16}, {'R', 17}, {'S', 18}, {'T', 19},
{'U', 20}, {'V', 21}, {'W', 22}, {'X', 23}, {'Y', 24},
{'Z', 25}, {'a', 26}, {'b', 27}, {'c', 28}, {'d', 28},
{'e', 30}, {'f', 31}, {'g', 32}, {'h', 33}, {'i', 34},
{'j', 35}, {'k', 36}, {'l', 37}, {'m', 38}, {'n', 39},
{'o', 40}, {'p', 41}, {'q', 42}, {'r', 43}, {'s', 44},
{'t', 45}, {'u', 46}, {'v', 47}, {'w', 48}, {'x', 49},
{'y', 50}, {'z', 51}, {'0', 52}, {'1', 53}, {'2', 54},
{'3', 55}, {'4', 56}, {'5', 57}, {'6', 58}, {'7', 59},
{'8', 60}, {'9', 61}, {'-', 62}, {'_', 63} //, {'=', 64}
};
constexpr char pad = '=';

/*
This section of math ensures that base64 encode/decode will work
as intended. Some rare architectures don't use 8-bit char's, and
it's possible this won't work as intended if a char isn't 8-bits.

In the rare case this is ported to an architecture where this
happens, feel free to comment out the static_assert's and test.
*/
constexpr int char_bits = std::numeric_limits<std::string_view::value_type>::digits;
constexpr int sign_bit = std::numeric_limits<std::string_view::value_type>::is_signed ? 1 : 0;
// check that the string_view character type is 8 (7 signed + 1 sign) bits wide
static_assert(char_bits + sign_bit == 8);
constexpr int numBitsPerChar = char_bits + sign_bit;
constexpr int numBitsPerBase64Char = 6;
constexpr int lcm = std::lcm(numBitsPerChar, numBitsPerBase64Char);
// make sure math works, 24 bits
static_assert(lcm == 24);
constexpr int numBytesPerChunk = lcm / numBitsPerChar;
constexpr int numBase64CharPerChunk = lcm / numBitsPerBase64Char;
// double check math works and bit width matches
static_assert(numBytesPerChunk * numBitsPerChar == numBase64CharPerChunk * numBitsPerBase64Char);
Comment on lines +87 to +107
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do like your attention towards portability, but a char not being 8 bits is very unlikely, especially given the architectures we are targeting. Even on architectures we aren't directly supporting (think ARM), char is still 8 bits.

This can be somewhat subjective, but the tradeoff for more code complexity over certain things that we can almost agree will certainly be the case, like a char being 8 bits, might not be worth it.


auto encode(std::string_view data) -> std::string;
auto decode(std::string_view data) -> std::string;
Comment on lines +109 to +110
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't these functions be made static?

} // namespace lmms::base64

#endif // _BASE64_H
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would've conflicted with the LMMS_ header guard prefix change if it happened on the same line.

132 changes: 132 additions & 0 deletions src/core/base64.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -54,4 +54,136 @@ QVariant decode( const QString & _b64, QVariant::Type _force_type )
}


//TODO C++20: It *may* be possible to make the following functions constepxr given C++20's constexpr std::string.

/**
* @brief Base64 encodes data.
* @param data
* @return std::string containing the Base64 encoded data.
*/
Comment on lines +57 to +63
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These doxygen comments may be more useful in the header file rather than the .cpp file as it is more easily accessible there.

PS: I know I was the one who added the C++20 TODO comment, but maybe integrate it better into the main doxygen comment with @todo.

auto encode(std::string_view data) -> std::string
{
if (data.empty()) { return ""; }

// base64 encoded string
std::string result;

// number of chunks to process + padding
auto div_result = std::div(data.length(), numBytesPerChunk);
auto numChunks = div_result.quot;
auto numTrailingBytes = div_result.rem;

// add 1 chunk to handle last padded chunk
if (numTrailingBytes) { ++numChunks; }

// allocate one time to prevent in-loop reallocations
result.reserve(numChunks * numBase64CharPerChunk);

/*
char1 char2 char3
7 6 5 4 3 2 1 0 | 7 6 5 4 3 2 1 0 | 7 6 5 4 3 2 1 0
5 4 3 2 1 0 | 5 4 3 2 1 0 | 5 4 3 2 1 0 | 5 4 3 2 1 0
b64c1 b64c2 b64c3 b64c4
*/
// Get the first base64 character offset from the char chunk
auto b64char1 = [](std::string_view chunk) {
return static_cast<std::string::value_type>(chunk[0] >> 2);
Comment on lines +89 to +90
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Any reason for passing std::string_view by value?
  • Given that std::string is an alias of std::basic_string<char>, std::string::value_type is always char.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • My understanding of string_view was that this is the way to do it. If not then I can change it. 👍
  • My thought was in the instance for some reason std::string isn't based on char.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding of string_view was that this is the way to do it.

It turns out you're correct, indeed.

};
// Get the second base64 character offset from the char chunk
auto b64char2 = [](std::string_view chunk) {
return static_cast<std::string::value_type>(
((chunk[0] & 0x03) << 4)
|
(chunk.size() > 0 ? ((chunk[1] & 0xF0) >> 4) : 0)
);
};
// Get the third base64 character offset from the char chunk
auto b64char3 = [](std::string_view chunk) {
return static_cast<std::string::value_type>(
(chunk.size() > 0 ? ((chunk[1] & 0x0F) << 2) : 0)
|
(chunk.size() > 1 ? ((chunk[2] & 0xC0) >> 6) : 0)
);
};
// Get the fourth base64 character offset from the char chunk
auto b64char4 = [](std::string_view chunk) {
return static_cast<std::string::value_type>(chunk.size() > 1 ? (chunk[2] & 0x3F) : 0);
};
for (int currentChunk = 0; currentChunk < numChunks; ++currentChunk)
{
std::string_view chunk = data.substr(currentChunk * numBytesPerChunk, numBytesPerChunk);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
std::string_view chunk = data.substr(currentChunk * numBytesPerChunk, numBytesPerChunk);
auto chunk = data.substr(currentChunk * numBytesPerChunk, numBytesPerChunk);

std::array<char, 4> output{
map[b64char1(chunk)],
map[b64char2(chunk)],
pad,
pad
};
switch (chunk.length()) {
case 3:
output[3] = map[b64char4(chunk)];
case 2:
output[2] = map[b64char3(chunk)];
default: /* no-op */;
Comment on lines +122 to +126
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be good to add [[fallthrough]]; when falling through. I prefer explicitly breaking on case 2 and default, but I'll leave it up to you.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was not aware of this, I'll add them.

};
result.append(output.data(), output.size());
}
return result;
}

/**
* @brief Decodes data in Base64.
* @param data
* @return std::string containing the original data.
*/
Comment on lines +133 to +137
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as before:

These doxygen comments may be more useful in the header file rather than the .cpp file as it is more easily accessible there.

auto decode(std::string_view data) -> std::string
{
if (data.empty()) { return ""; }
if (data.length() % numBase64CharPerChunk != 0) {
throw std::length_error("base64::decode : data length not a multiple of 4");
}

std::string result;

// number of chunks to process
auto numChunks = std::div(data.length(), numBase64CharPerChunk);

// allocate one time to prevent in-loop reallocations
result.reserve(numChunks.quot * numBytesPerChunk);
/*
char1 char2 char3
7 6 5 4 3 2 1 0 | 7 6 5 4 3 2 1 0 | 7 6 5 4 3 2 1 0
5 4 3 2 1 0 | 5 4 3 2 1 0 | 5 4 3 2 1 0 | 5 4 3 2 1 0
b64c1 b64c2 b64c3 b64c4
*/
// Get the first character from the base64 chunk
auto char1 = [](std::string_view chunk) {
return static_cast<std::string::value_type>(
(rmap.at(chunk[0]) << 2) | (rmap.at(chunk[1]) >> 4)
);
};
// Get the second character from the base64 chunk
auto char2 = [](std::string_view chunk) {
return static_cast<std::string::value_type>(
chunk[2] == pad
? 0
: ((rmap.at(chunk[1]) & 0x0F) << 4) | (rmap.at(chunk[2]) >> 2)
);
};
// Get the third character from the base64 chunk
auto char3 = [](std::string_view chunk) {
return static_cast<std::string::value_type>(
chunk[3] == pad
? 0
: ((rmap.at(chunk[2]) & 0x03) << 6) | rmap.at(chunk[3])
);
};
for (int currentChunk = 0; currentChunk < numChunks.quot; ++currentChunk) {
std::string_view chunk = data.substr(currentChunk * numBase64CharPerChunk, numBase64CharPerChunk);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think using auto here is a plus.

Suggested change
std::string_view chunk = data.substr(currentChunk * numBase64CharPerChunk, numBase64CharPerChunk);
auto chunk = data.substr(currentChunk * numBase64CharPerChunk, numBase64CharPerChunk);

result.push_back(char1(chunk));
result.push_back(char2(chunk));
result.push_back(char3(chunk));
}
return result;
}

} // namespace lmms::base64
1 change: 1 addition & 0 deletions tests/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -20,6 +20,7 @@ ADD_EXECUTABLE(tests
$<TARGET_OBJECTS:lmmsobjs>

src/core/AutomatableModelTest.cpp
src/core/base64Test.cpp
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be src/core/Base64Test.cpp.

src/core/ProjectVersionTest.cpp
src/core/RelativePathsTest.cpp

Expand Down
73 changes: 73 additions & 0 deletions tests/src/core/base64Test.cpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,73 @@
/*
* base64Test.cpp
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* base64Test.cpp
* Base64Test.cpp

*
* Copyright (c) 2022 Kevin Zander <veratil/at/gmail.com>
*
* This file is part of LMMS - https://lmms.io
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program (see COPYING); if not, write to the
* Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor,
* Boston, MA 02110-1301 USA.
*
*/

#include "QTestSuite.h"

#include "base64.h"

class Base64Test : QTestSuite
{
Q_OBJECT
private slots:
void create_test_data()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
void create_test_data()
void createTestData()

{
QTest::addColumn<QString>("original");
QTest::addColumn<QString>("encoded");

// Test Vectors from RFC 4648 Section 10
QTest::newRow("empty string") << "" << "";
QTest::newRow("1 chunk 2 pad") << "f" << "Zg==";
QTest::newRow("1 chunk 1 pad") << "fo" << "Zm8=";
QTest::newRow("1 chunk 0 pad") << "foo" << "Zm9v";
QTest::newRow("2 chunk 2 pad") << "foob" << "Zm9vYg==";
QTest::newRow("2 chunk 1 pad") << "fooba" << "Zm9vYmE=";
QTest::newRow("2 chunk 0 pad") << "foobar" << "Zm9vYmFy";
}
void b64_encode_data()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
void b64_encode_data()
void b64EncodeData()

{
create_test_data();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
create_test_data();
createTestData();

}
void b64_encode()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
void b64_encode()
void b64Encode()

{
using namespace lmms::base64;

QFETCH(QString, original);
QFETCH(QString, encoded);
QCOMPARE(QString(encode(original.toStdString()).c_str()), encoded);
}
void b64_decode_data()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
void b64_decode_data()
void b64DecodeData()

{
create_test_data();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
create_test_data();
createTestData();

}
void b64_decode()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
void b64_decode()
void b64Decode()

{
using namespace lmms::base64;

QFETCH(QString, original);
QFETCH(QString, encoded);
QCOMPARE(original, QString(decode(encoded.toStdString()).c_str()));
}
Comment on lines +33 to +70
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would personally add spaces between function definitions.

} Base64Tests;

#include "base64Test.moc"