cs144 checkpoint0
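1 Installing the system
It is best to use the image the course provides. I used Kubuntu 22.04 and ran into a pile of version problems later.
2 Installing dependencies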
sudo apt update && sudo apt install git cmake gdb build-essential clang \
clang-tidy clang-format gcc-doc pkg-config glibc-doc tcpdump tshark
tshark is the command-line version of Wireshark.
3 Networking by hand
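3.1 Fetching a web page
Fetch a page over HTTP by hand using telnet.
3.2 Sending an email
I don't have a SUNet ID, so this part did not work for me.
3.3 Local byte stream
Use netcat to listen on a local port and telnet to connect to it. Whatever is typed in one terminal shows up in the other, so the two terminals share a synchronized byte stream.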
nc -vlp 9090
telnet localhost 9090
4 webget
Writing a network program using an OS stream socket
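Stream socket: to the program it looks like a file, and the two ends see the same synchronized byte stream.
Goal: using the TCP implementation provided by the operating system, write a webget program that does what was just done by hand.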
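To make "a file with a synchronized byte stream at both ends" concrete, here is a minimal sketch using plain POSIX calls. It assumes Linux and uses a local AF_UNIX socketpair instead of TCP so that it needs no network; the function name echo_through_socketpair is made up for illustration and is not part of minnow.

```cpp
#include <stdexcept>
#include <string>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: writes `msg` into one end of a connected pair of
// stream sockets and returns whatever arrives at the other end.
// Only meant for short messages that fit in the kernel's socket buffer.
std::string echo_through_socketpair( const std::string& msg )
{
  int fds[2];
  if ( socketpair( AF_UNIX, SOCK_STREAM, 0, fds ) != 0 ) {
    throw std::runtime_error( "socketpair failed" );
  }

  // write() may accept fewer bytes than requested, so loop until done.
  size_t sent = 0;
  while ( sent < msg.size() ) {
    const ssize_t n = write( fds[0], msg.data() + sent, msg.size() - sent );
    if ( n <= 0 ) {
      close( fds[0] );
      close( fds[1] );
      throw std::runtime_error( "write failed" );
    }
    sent += static_cast<size_t>( n );
  }
  close( fds[0] ); // closing the writer lets the reader see EOF

  // read() returns 0 at EOF; a single call is not guaranteed to return everything.
  std::string received;
  char buf[256];
  ssize_t n = 0;
  while ( ( n = read( fds[1], buf, sizeof( buf ) ) ) > 0 ) {
    received.append( buf, static_cast<size_t>( n ) );
  }
  close( fds[1] );
  return received;
}
```

Calling echo_through_socketpair( "hello" ) should hand back the same bytes that were written, which is the same behavior the nc/telnet experiment shows across two terminals.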
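4.1 Initializing the repo
Connecting to GitHub was a bit of a problem: it requires a token or an SSH key. I went with a token.
4.2 Compiling the starter code
Now I finally understand why they recommend their own image. The very first error was that the cmake version was too old, so I upgraded cmake. Because this is a 22.04 system, an extra apt repository has to be added.
# Install prerequisites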
sudo apt update
sudo apt install software-properties-common wget
# Add the Kitware signing key
wget -O - https://apt.kitware.com/keys/kitware-archive-latest.asc 2>/dev/null | gpg --dearmor - | sudo tee /usr/share/keyrings/kitware-archive-keyring.gpg >/dev/null
# Add the repository
echo 'deb [signed-by=/usr/share/keyrings/kitware-archive-keyring.gpg] https://apt.kitware.com/ubuntu/ jammy main' | sudo tee /etc/apt/sources.list.d/kitware.list >/dev/null
# Update and install CMake
sudo apt update
sudo apt install cmake
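Then the build failed because <format> could not be found. It turned out the gcc/g++ version was too old. I first upgraded to 12, which was still not enough; it has to be 13.
# Add the PPA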
sudo add-apt-repository ppa:ubuntu-toolchain-r/test
sudo apt update
# Install GCC 13
sudo apt install gcc-13 g++-13
# Make it the default compiler
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-13 130
sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-13 130
# Verify the versions
gcc --version
g++ --version
After that it finally compiles:
cmake -S . -B build
cmake --build build
Upgrading to 24.04 LTS would have supported all of this out of the box, so I really made extra work for myself.
4.3 Modern C++
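RAII (Resource Acquisition Is Initialization): for paired operations such as malloc/free or new/delete, acquire the resource in a constructor and release it in the destructor, so the release cannot be skipped (for example, when a function returns early or throws and the cleanup code in its second half never runs).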
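A minimal sketch of the idea, with made-up names (Counter and Guard are illustrative, not from minnow): the "resource" is just a counter, so it is easy to see that the release still happens when an exception cuts the function short.

```cpp
#include <stdexcept>

// Hypothetical resource tracker: how many resources are currently held.
struct Counter
{
  int open = 0;
};

// RAII wrapper: acquires in the constructor, releases in the destructor.
class Guard
{
public:
  explicit Guard( Counter& c ) : c_( c ) { ++c_.open; }
  ~Guard() { --c_.open; }

  // Copying would release the same resource twice, so forbid it.
  Guard( const Guard& ) = delete;
  Guard& operator=( const Guard& ) = delete;

private:
  Counter& c_;
};

// Acquires a resource, then fails halfway through. The destructor of `g`
// runs during stack unwinding, so nothing is left open afterwards.
int open_after_failure()
{
  Counter c;
  try {
    Guard g( c );
    throw std::runtime_error( "early exit" );
  } catch ( const std::runtime_error& ) {
    // The Guard has already been destroyed by the time we get here.
  }
  return c.open;
}
```

With a manual acquire/release pair, the throw would skip the release. Here open_after_failure() ends with zero resources held, without any cleanup code in the function body.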
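The handout lists some style requirements. Surprisingly, new/delete, raw pointers, templates, and virtual functions are all discouraged.
Use git a lot and commit often, ideally with every commit in a runnable state, which makes debugging easier.
4.4 Reading the starter code
minnow wraps the operating system's C interfaces in Modern C++ classes, so from here on their classes are all that is needed. The public interfaces are in these two files: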
- util/socket.hh
- util/file_descriptor.hh
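Note that a Socket is a kind of FileDescriptor, and a TCPSocket is a kind of Socket.
4.5 Implementing webget
Implement the get_URL function in apps/webget.cc. Its job is to fetch a web page over HTTP.
Interfaces needed: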
util/socket.hh
TCPSocket()
Socket.connect(Address)
util/file_descriptor.hh
void read(string)
size_t write(string)
util/address.hh
Address()
This example makes class inheritance, and the benefit of it, very tangible.
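The inheritance chain can be sketched with toy classes. Everything below is a hypothetical stand-in (the Mini* names and the in-memory buffer are assumptions for illustration, not the real minnow classes): the point is only that write and read are defined once, at the bottom of the hierarchy, and the TCP-level class gets them for free.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical stand-in for a file descriptor: read/write live here only.
class MiniFileDescriptor
{
public:
  size_t write( std::string_view data )
  {
    buffer_ += data;
    return data.size();
  }
  void read( std::string& out )
  {
    out = buffer_;
    buffer_.clear();
  }

private:
  std::string buffer_; // pretend kernel buffer
};

// A socket is a file descriptor that can also connect to a peer.
class MiniSocket : public MiniFileDescriptor
{
public:
  void connect( const std::string& peer ) { peer_ = peer; }
  const std::string& peer() const { return peer_; }

private:
  std::string peer_;
};

// A TCP socket is a socket; it adds nothing here but inherits everything.
class MiniTCPSocket : public MiniSocket
{};

// Code written against the base class works for every derived class.
std::string roundtrip( MiniFileDescriptor& fd, std::string_view data )
{
  fd.write( data );
  std::string out;
  fd.read( out );
  return out;
}
```

A MiniTCPSocket can be passed to roundtrip() even though that function only knows about MiniFileDescriptor, which mirrors how get_URL calls connect, write, and read on a single TCPSocket object.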
Here is the webget implementation:
void get_URL( const string& host, const string& path )
{
  TCPSocket sock;
  sock.connect( Address( host, "http" ) );
  string req = "GET " + path + " HTTP/1.1\r\n"
               "Host: " + host + "\r\n"
               "Connection: Close\r\n"
               "\r\n";
  sock.write( req );
  while ( true ) {
    string response;
    sock.read( response );
    // An empty read means EOF: stop reading and printing.
    // The handout warns that "a single call to read is not enough".
    if ( response.empty() ) {
      break;
    }
    cout << response;
  }
  debug( "Function called: get_URL( \"{}\", \"{}\" )", host, path );
}
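Surprisingly, it worked after only a few rounds of debugging. The first mistake was writing String instead of string, the second was calling read the wrong way as response = sock.read(), and the third was initializing response to NULL, which gave an error.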
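The request string is the easiest part to get subtly wrong, so it can help to build it in a separate function and check it on its own. The sketch below is an assumption about how one might factor it out (build_get_request is a made-up name, not part of the starter code). Every header line ends with "\r\n", an empty line ends the headers, and "Connection: close" asks the server to close the connection after the response, which is what lets the read loop hit EOF.

```cpp
#include <string>

// Hypothetical helper: builds the same HTTP/1.1 GET request that get_URL sends.
std::string build_get_request( const std::string& host, const std::string& path )
{
  return "GET " + path + " HTTP/1.1\r\n"
         "Host: " + host + "\r\n"
         "Connection: close\r\n"
         "\r\n"; // blank line terminates the header section
}
```

For host cs144.keithw.org and path /hello this produces exactly the text typed into telnet in section 3.1.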
Build, run, and test:
cmake --build build
./build/apps/webget cs144.keithw.org /hello
cmake --build build --target check_webget
5 ByteStream
An in-memory reliable byte stream
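An in-memory reliable byte stream is essentially a Reader-Writer structure, except that multithreading, locks, and the like are not a concern.
There are three candidate data structures for the buffer:
- deque<char> is the simplest, but having to loop over every byte means performance is bound to be terrible
- deque<string> should perform best, but is also the most complex
  - deque has a clever memory allocation scheme, but the mismatch between the reader's and the writer's chunk lengths has to be handled
- string buffer_ is the middle option: a single string serves as the buffer
  - I don't know how string allocates memory, but performance is acceptable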
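For comparison, here is a rough sketch of what the deque<string> option might look like. It is not the approach used in this write-up, and ChunkBuffer and its members are hypothetical names. The extra complexity is the front_offset_ bookkeeping, which handles a pop that ends in the middle of a chunk.

```cpp
#include <cstdint>
#include <deque>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical chunked buffer: each push is stored as its own string.
class ChunkBuffer
{
public:
  void push( std::string data )
  {
    if ( data.empty() ) {
      return;
    }
    size_ += data.size();
    chunks_.push_back( std::move( data ) ); // no byte-by-byte copy
  }

  // Exposes only the unread part of the first chunk.
  std::string_view peek() const
  {
    if ( chunks_.empty() ) {
      return {};
    }
    return std::string_view( chunks_.front() ).substr( front_offset_ );
  }

  // Removes up to `len` bytes, possibly spanning several chunks.
  void pop( uint64_t len )
  {
    while ( len > 0 && !chunks_.empty() ) {
      const uint64_t avail = chunks_.front().size() - front_offset_;
      if ( len < avail ) {
        front_offset_ += len; // pop ends inside this chunk
        size_ -= len;
        return;
      }
      len -= avail; // consume the whole chunk and move on
      size_ -= avail;
      chunks_.pop_front();
      front_offset_ = 0;
    }
  }

  uint64_t size() const { return size_; }

private:
  std::deque<std::string> chunks_ {};
  uint64_t front_offset_ { 0 }; // bytes already popped from the first chunk
  uint64_t size_ { 0 };
};
```

The trade-off is that peek() can only return one chunk at a time, so a caller that wants many bytes has to peek and pop in a loop.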
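Preparation
Syntax
- Private member variables are named var_
- bool error_ { false }; the trailing {} is a default member initializer
- const Reader& reader() const; the trailing const means the method does not modify the object, and the leading const makes the returned reference read-only
- explicit ByteStream( uint64_t capacity ); explicit prevents implicit conversions, so the object has to be initialized by calling the constructor explicitly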
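The syntax points above can be checked with a small stand-alone class. Stream is a made-up name used only for this sketch; the static_asserts are evaluated at compile time and show what explicit rules out.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical class that uses the same syntax as the ByteStream header.
class Stream
{
public:
  explicit Stream( uint64_t capacity ) : capacity_( capacity ) {}

  // Trailing const: these methods promise not to modify the object.
  uint64_t capacity() const { return capacity_; }
  bool has_error() const { return error_; }

private:
  uint64_t capacity_;    // trailing underscore marks a private member
  bool error_ { false }; // default member initializer
};

// explicit forbids the implicit conversion `Stream s = 15;` ...
static_assert( !std::is_convertible_v<uint64_t, Stream> );
// ... but direct initialization `Stream s { 15 };` is still allowed.
static_assert( std::is_constructible_v<Stream, uint64_t> );
```

Without explicit, a function taking a Stream could be called with a bare integer and would silently construct a temporary stream, which is rarely what was intended.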
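Big picture
- reader and writer are two different interfaces to the same ByteStream
- State that must be shared lives in the protected section of ByteStream
std::string interfaces:
- str.append( str1.data(), len ) appends the first len characters of str1
- str.erase( pos, len ) removes len characters starting at pos
- bool str.empty() reports whether the string is empty
peek and pop
- Splitting reads into peek and pop is more flexible and suits networking use cases
- With the string approach, peek can expose the entire buffer at once
- string_view(buffer_) is only a reference to the data, so no deep copy is made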
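A small sketch of why string_view avoids a copy, and of the peek-then-pop pattern on a plain std::string. The function names are hypothetical. One caveat worth keeping in mind: a string_view points into the string's storage, so it should be treated as invalid once the string is modified, which is why the bytes are copied out before erase is called.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// True when a view of `buffer` points at the buffer's own bytes (no copy made).
bool view_aliases_buffer( const std::string& buffer )
{
  const std::string_view v( buffer );
  return v.data() == buffer.data() && v.size() == buffer.size();
}

// Drains `buffer` in pieces of at most `step` bytes using peek-then-pop.
std::string drain( std::string buffer, size_t step )
{
  std::string out;
  if ( step == 0 ) {
    return out; // nothing could ever be popped
  }
  while ( !buffer.empty() ) {
    const std::string_view piece = std::string_view( buffer ).substr( 0, step ); // "peek"
    out.append( piece );              // copy the bytes out while the view is still valid
    buffer.erase( 0, piece.size() );  // "pop"; piece must not be dereferenced after this
  }
  return out;
}
```

This is roughly the shape of the provided read() helper described in the header comment: peek at what is available, take up to a maximum, then pop exactly what was taken.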
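What needs to change
- The state variables in the protected section of the .hh file
- The methods in the .cc file
Notes
- Every method is simple and the amount of code is small, so there is really no need to be intimidated. If stuck, get a hint from Copilot and then write it yourself.
- Watch the boundary conditions, and be careful about which variable is actually being modified
Implementation
byte_stream.hh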
#pragma once

#include <cstdint>
#include <string>
#include <string_view>

class Reader;
class Writer;

class ByteStream
{
public:
  explicit ByteStream( uint64_t capacity );

  // Helper functions (provided) to access the ByteStream's Reader and Writer interfaces
  Reader& reader();
  const Reader& reader() const;
  Writer& writer();
  const Writer& writer() const;

  void set_error() { error_ = true; };       // Signal that the stream suffered an error.
  bool has_error() const { return error_; }; // Has the stream had an error?

protected:
  // Please add any additional state to the ByteStream here, and not to the Writer and Reader interfaces.
  uint64_t capacity_;
  bool error_ { false };
  bool is_closed_ { false };
  // A single string serves as the buffer
  std::string buffer_ {};
  // Reader and Writer must see the same counters, so they live in ByteStream
  uint64_t total_bytes_pushed_ {};
  uint64_t total_bytes_popped_ {};
};

class Writer : public ByteStream
{
public:
  void push( std::string data ); // Push data to stream, but only as much as available capacity allows.
  void close();                  // Signal that the stream has reached its ending. Nothing more will be written.

  bool is_closed() const;              // Has the stream been closed?
  uint64_t available_capacity() const; // How many bytes can be pushed to the stream right now?
  uint64_t bytes_pushed() const;       // Total number of bytes cumulatively pushed to the stream
};

class Reader : public ByteStream
{
public:
  std::string_view peek() const; // Peek at the next bytes in the buffer -- ideally as many as possible.
  void pop( uint64_t len );      // Remove `len` bytes from the buffer.

  bool is_finished() const;        // Is the stream finished (closed and fully popped)?
  uint64_t bytes_buffered() const; // Number of bytes currently buffered (pushed and not popped)
  uint64_t bytes_popped() const;   // Total number of bytes cumulatively popped from stream
};

/*
 * read: A (provided) helper function that peeks and pops up to `max_len` bytes
 * from a ByteStream Reader into a string;
 */
void read( Reader& reader, uint64_t max_len, std::string& out );
byte_stream.cc
#include "byte_stream.hh"
#include "debug.hh"

using namespace std;

ByteStream::ByteStream( uint64_t capacity ) : capacity_( capacity ) {}

// Push data to stream, but only as much as available capacity allows.
void Writer::push( string data )
{
  if ( error_ || is_closed_ ) {
    return;
  }
  uint64_t can_write = available_capacity();
  uint64_t to_write = min( can_write, static_cast<uint64_t>( data.size() ) );
  buffer_.append( data.data(), to_write );
  total_bytes_pushed_ += to_write;
}

// Signal that the stream has reached its ending. Nothing more will be written.
void Writer::close()
{
  is_closed_ = true;
}

// Has the stream been closed?
bool Writer::is_closed() const
{
  return is_closed_;
}

// How many bytes can be pushed to the stream right now?
uint64_t Writer::available_capacity() const
{
  if ( buffer_.size() >= capacity_ ) {
    return 0;
  }
  return capacity_ - buffer_.size();
}

// Total number of bytes cumulatively pushed to the stream
uint64_t Writer::bytes_pushed() const
{
  return total_bytes_pushed_;
}

// Peek at the next bytes in the buffer -- ideally as many as possible.
// It's not required to return a string_view of the *whole* buffer, but
// if the peeked string_view is only one byte at a time, it will probably force
// the caller to do a lot of extra work.
string_view Reader::peek() const
{
  return string_view( buffer_ );
}

// Remove `len` bytes from the buffer.
void Reader::pop( uint64_t len )
{
  uint64_t to_pop = min( len, static_cast<uint64_t>( buffer_.size() ) );
  buffer_.erase( 0, to_pop );
  total_bytes_popped_ += to_pop;
}

// Is the stream finished (closed and fully popped)?
bool Reader::is_finished() const
{
  return is_closed_ && buffer_.empty();
}

// Number of bytes currently buffered (pushed and not popped)
uint64_t Reader::bytes_buffered() const
{
  return buffer_.size();
}

// Total number of bytes cumulatively popped from stream
uint64_t Reader::bytes_popped() const
{
  return total_bytes_popped_;
}
Test
cmake --build build --target check0
Debug
I actually hit only one real bug. The terminal output got lost, so the following is reconstructed from memory:
1. initialize capacity=15
2. Writer.close()
...
5. Writer.available_capacity() -> 0
Error: available_capacity() should be 15
It was caused by a wrong boundary condition:
uint64_t Writer::available_capacity() const
{
  if ( error_ || is_closed_ ) {
    return 0;
  }
  return capacity_ - buffer_.size();
}
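At first I thought that once the stream is closed, available_capacity should of course drop to 0. Looking at it now, available_capacity() should depend only on buffer_.size() and capacity_; bringing error_ and is_closed_ into it makes no sense.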
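The distinction can be reproduced in a few lines. MiniStream below is a hypothetical, stripped-down stand-in for the real classes: closing the stream stops further writes from being accepted, but the reported capacity is still just the capacity minus what is buffered.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>

// Hypothetical miniature of the Writer side, for illustrating the bug only.
struct MiniStream
{
  uint64_t capacity;
  std::string buffer {};
  bool closed { false };

  // Depends only on capacity and buffered bytes, not on `closed`.
  uint64_t available_capacity() const
  {
    return buffer.size() >= capacity ? 0 : capacity - buffer.size();
  }

  // The closed check belongs here: a closed stream accepts no new bytes.
  void push( const std::string& data )
  {
    if ( closed ) {
      return;
    }
    const uint64_t n = std::min<uint64_t>( available_capacity(), data.size() );
    buffer.append( data.data(), n );
  }
};
```

With capacity 15, closing the empty stream leaves available_capacity() at 15, matching what the test expected, while a push after close still leaves the buffer untouched.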
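There was also an amusing bug: around the 7th test case the terminal suddenly started spewing AddressSanitizer:DEADLYSIGNAL. A search showed that ASLR was the cause (I had learned a little about ASLR in an earlier binary lab). The fix was to turn ASLR off. Note that this disables it system-wide until the next reboot; writing 2 instead of 0 restores the usual default.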
echo 0 | sudo tee /proc/sys/kernel/randomize_va_space
Results
Performance was surprisingly decent, reaching about 10 Gbit/s at pop length 4096.
Test project /home/bbz/cs144/minnow/build
Start 1: compile with bug-checkers
1/11 Test #1: compile with bug-checkers ........ Passed 0.28 sec
Start 2: t_webget
2/11 Test #2: t_webget ......................... Passed 2.11 sec
Start 3: byte_stream_basics
3/11 Test #3: byte_stream_basics ............... Passed 0.02 sec
Start 4: byte_stream_capacity
4/11 Test #4: byte_stream_capacity ............. Passed 0.02 sec
Start 5: byte_stream_one_write
5/11 Test #5: byte_stream_one_write ............ Passed 0.02 sec
Start 6: byte_stream_two_writes
6/11 Test #6: byte_stream_two_writes ........... Passed 0.02 sec
Start 7: byte_stream_many_writes
7/11 Test #7: byte_stream_many_writes .......... Passed 0.14 sec
Start 8: byte_stream_stress_test
8/11 Test #8: byte_stream_stress_test .......... Passed 0.05 sec
Start 37: no_skip
9/11 Test #37: no_skip .......................... Passed 0.01 sec
Start 38: compile with optimization
10/11 Test #38: compile with optimization ........ Passed 14.25 sec
Start 39: byte_stream_speed_test
ByteStream throughput (pop length 4096): 10.39 Gbit/s
ByteStream throughput (pop length 128): 1.40 Gbit/s
ByteStream throughput (pop length 32): 0.41 Gbit/s
11/11 Test #39: byte_stream_speed_test ........... Passed 1.18 sec
100% tests passed, 0 tests failed out of 11
Total Test time (real) = 18.11 sec
Built target check0
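The throughput falls off sharply at small pop lengths. One plausible reason (an assumption, not something measured here) is that buffer_.erase( 0, n ) has to shift every remaining byte to the front on each pop. A common way around that is to keep a read offset and only compact occasionally. The sketch below uses hypothetical names (OffsetBuffer, head_) and is not the implementation benchmarked above.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <string_view>

// Hypothetical buffer that defers the expensive erase.
class OffsetBuffer
{
public:
  void push( std::string_view data ) { data_.append( data ); }

  // Unread bytes start at head_.
  std::string_view peek() const { return std::string_view( data_ ).substr( head_ ); }

  void pop( uint64_t len )
  {
    head_ += std::min<uint64_t>( len, data_.size() - head_ );
    // Compact only once at least half of the storage is already consumed,
    // so the cost of shifting bytes is amortized over many pops.
    if ( head_ > 0 && head_ * 2 >= data_.size() ) {
      data_.erase( 0, head_ );
      head_ = 0;
    }
  }

  uint64_t size() const { return data_.size() - head_; }

private:
  std::string data_ {};
  uint64_t head_ { 0 }; // number of bytes at the front that are already popped
};
```

In a real ByteStream, available_capacity() would have to be computed from size() rather than from the underlying string's length, since the string temporarily holds bytes that have already been popped.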
And that's checkpoint0 done! I suddenly have a bit of interest and confidence in C++.