[C++] C++ Standard Library part2. File I/O streams / Strings / Containers

ysk1m 2025. 3. 19. 23:42

File I/O Streams

File input and output are handled with the ifstream (input file stream) and ofstream (output file stream) classes from the <fstream> header.

File I/O has a lot in common with standard I/O.

File I/O Streams-Reading

To read data.txt, create an ifstream object named file.

std::ifstream file("data.txt");

Check whether the file opened successfully.

file.is_open()

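Use getline to read the file line by line.

std::string line;
while (std::getline(file, line)) {
    std::cout << line << std::endl;
}

Declare a std::string named line, then call getline with the ifstream (file) as the first argument and line as the second. The loop ends when getline fails, typically at the end of the file.

Each line is then printed with cout.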

file.close();
return 0;

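Close the file once all the work is done. (An ifstream also closes its file automatically when the object is destroyed.)

main returns 0 here because in C++ main must return an int, and the caller (the operating system or shell) uses that value to judge whether the program ran successfully.

Returning 0: the file was read and closed successfully

Returning 1 (or any other non-zero value, by convention): something failed, for example the file could not be opened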
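Putting the reading steps together, a minimal sketch might look like the following. The helper name read_lines and its signature are made up for this illustration; they are not part of the standard library.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: appends every line of the file at `path` to `lines`.
// Returns false if the file could not be opened.
bool read_lines(const std::string& path, std::vector<std::string>& lines) {
    std::ifstream file(path);
    if (!file.is_open()) {
        return false;  // the caller can turn this into a non-zero exit code
    }
    std::string line;
    while (std::getline(file, line)) {
        lines.push_back(line);
    }
    return true;  // file is closed automatically when it goes out of scope
}
```

In main, something like return read_lines("data.txt", lines) ? 0 : 1; would then report success or failure through the exit code.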

As mentioned at the start, ifstream has a lot in common with iostream.

std::ifstream file("data.txt");
int number;
while (file >> number) {
    std::cout << number << std::endl;
}
file.close();
/*
Output:
1
2
3
*/

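As with std::cin, the >> operator skips leading whitespace (space, tab, \n) and then reads characters until it reaches the next whitespace. The output above assumes data.txt contains 1 and 2 separated by a space, followed by 3 on the next line.

So it first reads 1, stops at the space, and number=1 is printed.

It then reads 2, stops at the newline (\n), and number=2 is printed; finally it reads 3.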

File I/O Streams-Writing

To write to a file, open it as follows.

std::ofstream file("output.txt");

Check whether the file opened successfully.

file.is_open();

Write whatever you want to the file the same way you would with cout.

file << "Hello, Data Science!" << std::endl;

Close the file with file.close(); this also flushes any buffered output to disk.

file.close();
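The writing steps can be combined into one function, sketched below. The name write_lines is hypothetical, and checking fail() after close() is just one reasonable way to detect a write error.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: writes each string on its own line.
// Opening with std::ofstream creates the file or truncates an existing one.
bool write_lines(const std::string& path, const std::vector<std::string>& lines) {
    std::ofstream file(path);
    if (!file.is_open()) {
        return false;
    }
    for (const std::string& line : lines) {
        file << line << '\n';
    }
    file.close();         // flushes buffered output
    return !file.fail();  // false if any write or the close failed
}
```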

Strings

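In C, a string is represented as an array of char.

For example, abc is {'a', 'b', 'c', '\0'}.

In C++, on the other hand, the string class manages a dynamic array, so it can handle a sequence of characters whose length changes.

Internally, a String keeps track of the following information. (The characters usually live on the heap rather than the stack, although many implementations store short strings inside the object itself, the so-called small-string optimization.)

Data: a pointer to the char array (a contiguous memory block) that holds the characters
Size: the current length of the string
Capacity: the total number of characters the allocated storage can hold

When characters are appended to a String and the capacity would be exceeded, the internal array is reallocated into a larger memory block.

To use String, add #include <string>.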

Strings-Initialization

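Besides default construction (an empty string), there are two ways to initialize a string: direct initialization and copy initialization.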

#include <string>

int main() {
    std::string str1;             // empty (default initialization)
    std::string str2("String2");  // direct initialization
    std::string str3 = "String3"; // copy initialization

    return 0;
}

Direct initialization

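The constructor is called with parentheses.

Because the constructor is called directly, even an explicit constructor can be used.

Copy initialization

The object is initialized with an equals sign.

With the equals sign, the compiler implicitly selects a suitable converting constructor.

However, copy initialization cannot use an explicit constructor.

For that reason, direct initialization arguably has the edge in readability and safety.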
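The effect of explicit can be seen with two tiny classes. Meters and Label are hypothetical types invented for this sketch; only the rule they demonstrate comes from the language.

```cpp
#include <string>

// Hypothetical type with an explicit constructor.
struct Meters {
    explicit Meters(double v) : value(v) {}
    double value;
};

// Hypothetical type with a non-explicit (converting) constructor.
struct Label {
    Label(const char* t) : text(t) {}
    std::string text;
};

Meters make_meters() {
    Meters direct(3.5);      // direct initialization: allowed
    // Meters copy = 3.5;    // copy initialization: compile error (explicit)
    return direct;
}

Label make_label() {
    Label copy = "axis";     // copy initialization: allowed (not explicit)
    return copy;
}
```

Uncommenting the Meters copy = 3.5; line should make the compiler reject the program, which is exactly the protection explicit is meant to give.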

Strings-Concatenation

Strings can be concatenated with the + operator.

string firstName="Data";
string lastName="Scientist";
string fullName=firstName + " " + lastName;
cout << fullName << endl; //Output: Data Scientist

A String can also be appended to with the += operator.

string name = "Data";
name += " Scientist"; // or name.append(" Scientist")
cout << name << endl; // Output: Data Scientist

Strings-Comparison

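Strings can be compared with the ==, !=, <, and > operators, or with the compare method.

The ordering is lexicographical (dictionary order, i.e. alphabetical order).

So a is less than b.

Note that the comparison is case-sensitive: characters are compared by their character codes, so in ASCII every uppercase letter sorts before every lowercase letter.

string str1 = "Apple";
string str2 = "Banana";

if (str1 == str2) {...}
else if (str1 > str2) {...}
else if (str1 < str2) {...}

cout << str1.compare(str2); // negative (often -1): str1 is less than str2
cout << str2.compare(str1); // positive (often +1): str2 is greater than str1
cout << str1.compare(str1); // 0: the strings are equal

The compare method compares from the point of view of the string it is called on (the one written before .compare): the result is negative if that string is smaller, positive if it is larger, and 0 if the two are equal. Only the sign is guaranteed, not the exact value.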
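Since only the sign of compare's result is portable, code that wants exactly -1, 0, or +1 can normalize it. A small sketch (compare_sign is a hypothetical helper):

```cpp
#include <string>

// Hypothetical helper: maps the result of std::string::compare to -1, 0, or 1.
int compare_sign(const std::string& a, const std::string& b) {
    int result = a.compare(b);
    if (result < 0) return -1;  // a sorts before b
    if (result > 0) return 1;   // a sorts after b
    return 0;                   // a and b are equal
}
```

With ASCII strings, compare_sign("Zebra", "apple") gives -1, because the uppercase Z has a smaller character code than the lowercase a.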

Strings-Finding Substrings

The find method locates a substring within a string.

The position it returns is the index of the substring's first character.

Like Python, C++ indexes from 0.

The find method

string fullName="Data Science";
size_t pos= fullName.find("Science");
if (pos != string::npos){
    cout << "Found 'Science' at position:"<<pos<<endl;
}

size_t is an unsigned integer type.

It is mainly used to represent sizes and counts.

npos is a constant equal to the largest value a size_t can hold.

It serves as the "not found" marker: no real index can ever be that large, so the find method returns string::npos when the search fails.

If pos equals npos, the substring is not in the string; any other value is the index where it was found.
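find also accepts a starting index as its second argument, which makes it possible to collect every occurrence rather than only the first. The sketch below assumes overlapping matches should be reported; find_all is a hypothetical helper name.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns the index of every occurrence of `needle`
// in `text`, including overlapping ones.
std::vector<std::size_t> find_all(const std::string& text, const std::string& needle) {
    std::vector<std::size_t> positions;
    if (needle.empty()) {
        return positions;  // treat an empty needle as "no matches"
    }
    std::size_t pos = text.find(needle);
    while (pos != std::string::npos) {
        positions.push_back(pos);
        pos = text.find(needle, pos + 1);  // resume one character later
    }
    return positions;
}
```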

The substr(position, length) method

It returns the substring that starts at index position and is length characters long.

string str = "Data Science";
cout << str.substr(5, 3);

Starting from S at index 5 and taking 3 characters including the S, the output is "Sci".
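find and substr are often used together, for example to split a string at the first occurrence of a delimiter. A minimal sketch, with split_once as a hypothetical helper name:

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical helper: splits `text` at the first `delimiter`.
// If the delimiter is absent, the whole text is returned as the first part.
std::pair<std::string, std::string> split_once(const std::string& text, char delimiter) {
    std::size_t pos = text.find(delimiter);
    if (pos == std::string::npos) {
        return std::make_pair(text, std::string());
    }
    // substr(pos + 1) with no length takes everything up to the end.
    return std::make_pair(text.substr(0, pos), text.substr(pos + 1));
}
```

Note that substr throws std::out_of_range when position is greater than the string's size, while a length that runs past the end is simply clamped.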

Strings-Replacing Substrings

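The replace(start, length, replacement) method

string str = "Data Science";
string replacedString = str.replace(0, 4, "Bio");
cout << replacedString << endl; // Output: Bio Science

Starting from D at index 0 and covering 4 characters including the D, "Data" is replaced with "Bio". Note that replace modifies str itself and returns a reference to it, so replacedString is a copy of the already-modified str.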
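replace works on one position at a time, so replacing every occurrence of a substring takes a loop around find. One possible sketch (replace_all is a hypothetical helper; the standard library has no member with this name):

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns a copy of `text` with every occurrence
// of `from` replaced by `to`.
std::string replace_all(std::string text, const std::string& from, const std::string& to) {
    if (from.empty()) {
        return text;  // nothing sensible to search for
    }
    std::size_t pos = 0;
    while ((pos = text.find(from, pos)) != std::string::npos) {
        text.replace(pos, from.size(), to);
        pos += to.size();  // skip past the inserted text to avoid re-matching it
    }
    return text;
}
```

Taking text by value keeps the caller's string untouched, unlike calling replace on it directly.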

Strings-Conversion

std::stoi and std::stod convert a string to an integer and to a floating-point number, respectively.

std::string number="42";
int intNum=std::stoi(number);
double doubleNum=std::stod(number);

Going the other way, std::to_string converts a number to a string.

double doubleValue = 123.456;
std::string doubleStr = std::to_string(doubleValue); // "123.456000"
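std::stoi throws std::invalid_argument when no number can be parsed and std::out_of_range when the value does not fit in an int, and it silently accepts trailing characters such as "42abc". A stricter wrapper might look like this (parse_int is a hypothetical helper):

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: parses `text` as an int.
// Returns false on invalid input, overflow, or trailing characters.
bool parse_int(const std::string& text, int& out) {
    try {
        std::size_t used = 0;
        int value = std::stoi(text, &used);  // `used` = number of characters consumed
        if (used != text.size()) {
            return false;  // e.g. "42abc"
        }
        out = value;
        return true;
    } catch (const std::invalid_argument&) {
        return false;  // e.g. "abc"
    } catch (const std::out_of_range&) {
        return false;  // value too large or too small for int
    }
}
```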

Strings-Memory

#include <iostream>
#include <string>
using namespace std;

int main() {
    string myString = "Hello, World!";
    cout << "Initial Size: " << myString.size() << endl; // 13
    cout << "Initial Capacity: " << myString.capacity() << endl; // e.g. 22 (implementation-dependent)
    cout << "Initial Memory Address: " << (void*)myString.c_str() << endl;
    // e.g. 0x7ff7b33eb241

    // Append a short string (fits within the current capacity)
    myString += "!";
    cout << "\nAfter small append:" << endl;
    cout << "Size: " << myString.size() << endl; // 14
    cout << "Capacity: " << myString.capacity() << endl; // still 22
    cout << "Memory Address: " << (void*)myString.c_str() << endl; // unchanged: 0x7ff7b33eb241

    // Append a long string (exceeds the capacity, so a reallocation occurs)
    myString += " A large string";
    cout << "\nAfter large append:" << endl;
    cout << "Size: " << myString.size() << endl; // 29
    cout << "Capacity: " << myString.capacity() << endl; // larger (e.g. 47)
    cout << "Memory Address: " << (void*)myString.c_str() << endl; // changed, e.g. 0x7ff2b6f05e30
    return 0;
}

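After appending only "!": the capacity does not change. > Start address of the internal array holding the actual data: 0x7ff7b33eb241
After appending " A large string": the capacity changes. > Start address of the internal array holding the actual data: 0x7ff2b6f05e30

As this shows, when appended characters would exceed the capacity, the string reallocates into a larger block of memory. The exact capacities and addresses vary by implementation and from run to run.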
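When the final length is known in advance, reserve can request the capacity up front so that the appends that follow do not need to reallocate. The sketch below records whether the buffer address changed; AppendReport and append_with_reserve are hypothetical names, and the no-move result reflects how mainstream implementations behave rather than a quoted guarantee.

```cpp
#include <cstddef>
#include <string>

// Hypothetical record of what happened while appending.
struct AppendReport {
    std::size_t size;
    std::size_t capacity;
    bool buffer_moved;
};

// Hypothetical helper: reserves room for `count` characters, appends that
// many, and reports whether the internal buffer address changed.
AppendReport append_with_reserve(std::size_t count) {
    std::string text;
    text.reserve(count);  // capacity() is now at least `count`
    const void* before = static_cast<const void*>(text.data());
    for (std::size_t i = 0; i < count; ++i) {
        text += 'x';  // stays within the reserved capacity
    }
    AppendReport report;
    report.size = text.size();
    report.capacity = text.capacity();
    report.buffer_moved = static_cast<const void*>(text.data()) != before;
    return report;
}
```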

The c_str() method

(void*)myString.c_str()

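Looking at this code more closely:

.c_str() returns the start address of the internal array that holds the actual data, as a const char*.

Casting to (void*) makes cout print that memory address instead of interpreting the pointer as a C string and printing its characters.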

Stringstreams

A stringstream treats a String as a stream, so it can be read from and written to much like file I/O or console I/O.

It comes from the <sstream> header.

#include <iostream>
#include <sstream>
#include <string>

int main() {
    std::stringstream parser("42,3.14,Hello World");
    int intValue;
    double doubleValue;
    std::string strValue;
    char ignoreChar; // receives each comma separator

    parser >> intValue >> ignoreChar >> doubleValue >> ignoreChar;
    std::getline(parser, strValue);
    std::cout << "Integer: " << intValue << ", Double: " << doubleValue << ", String: " << strValue << std::endl;

    return 0;
}

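"Hello World" contains a space, so getline is used to read the rest of the line as a whole; >> would stop at the space and read only "Hello".

The >> operator reads as many characters as fit the target type, so reading the int stops at the first comma and reading the double stops at the second; it does not skip the commas by itself.

intValue and doubleValue receive the values matching their types, and each comma is read into ignoreChar and discarded.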
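An alternative to the ignoreChar trick is the three-argument form of std::getline, which reads up to a chosen delimiter. The sketch below splits a line into string fields that can then be converted with std::stoi or std::stod; split_fields is a hypothetical helper name.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits `line` into fields separated by `delimiter`.
std::vector<std::string> split_fields(const std::string& line, char delimiter) {
    std::vector<std::string> fields;
    std::istringstream parser(line);
    std::string field;
    // getline with a third argument stops at `delimiter` instead of '\n'.
    while (std::getline(parser, field, delimiter)) {
        fields.push_back(field);
    }
    return fields;
}
```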

#include <iostream>
#include <sstream>

int main() {
    std::stringstream ss; // create an empty stringstream
    ss << 100;
    ss << 3.14;
    ss << "Hello";

    std::cout << ss.str() << std::endl; // Output: 1003.14Hello

    return 0;
}

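After appending data with the << operator, the str() method returns the entire string held in the stream, which is then printed. Nothing is inserted between the values, so they run together.

As a rule, avoid mixing reads and writes on the same stringstream object: both directions share one buffer and one set of state flags, which makes the result easy to get wrong.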
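One way to keep reading and writing apart is to use std::ostringstream for output only and std::istringstream for input only. As a sketch of the write-only side, the hypothetical helper join below builds a string from numbers with a separator between them:

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: formats the values into one string,
// placing `separator` between neighbouring values.
std::string join(const std::vector<int>& values, const std::string& separator) {
    std::ostringstream out;  // write-only string stream
    for (std::size_t i = 0; i < values.size(); ++i) {
        if (i > 0) {
            out << separator;
        }
        out << values[i];
    }
    return out.str();
}
```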

Containers

Vector

A vector is similar to a Python list: a sequence container that encapsulates a dynamic-size array.

contiguous storage
random access: an element can be reached directly through its position index (as with a list). Insertion and deletion at the end are also very efficient, since no other elements have to move.
resize automatically
[Figure: the rough layout of a vector]

Vector-Initialization

To use Vector, include the header with #include <vector>.

A Vector can hold elements of only one type.

std::vector<int>
std::vector<std::string>

The element type, such as int or string, has to be specified like this.

#include <iostream>
#include <vector>

int main() {
    std::vector<int> vec1 = {1, 2, 3, 4, 5}; // (1)

    int arr[] = {6, 7, 8, 9, 10};
    std::vector<int> vec2(std::begin(arr), std::end(arr)); // (2)

    std::vector<int> vec3(5, 100); // (3)

    return 0;
}

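(1): The elements are listed directly in an initializer list.
(2): An int array arr is declared with 5 initial elements. begin(arr) returns a pointer to the first element and end(arr) returns a pointer to one past the last element. vec2 copies the elements from the begin pointer up to, but not including, the end pointer.
(3): The size and value are given explicitly: the vector is initialized with 5 elements, each equal to 100.

Strictly speaking, these are iterators. For a built-in array such as arr, std::begin and std::end do return raw pointers, and a pointer is one kind of iterator.

I say "pointer" in places because it behaves similarly and is the more familiar term, but the precise term is iterator.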
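One detail worth seeing in code is that parentheses and braces select different constructors for std::vector<int>. The two function names below are made up for this sketch.

```cpp
#include <vector>

// (5, 100) calls the size/value constructor: five elements, each 100.
std::vector<int> with_parentheses() {
    return std::vector<int>(5, 100);
}

// {5, 100} is an initializer list: two elements, 5 and 100.
std::vector<int> with_braces() {
    return std::vector<int>{5, 100};
}
```

So form (3) above has to be written with parentheses; writing it with braces would create a two-element vector instead.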

Vector-Insertion

Include the headers as follows and initialize a vector named vec.

#include <iostream>
#include <vector>
using namespace std;

int main(){
	vector<int> vec={1, 2, 3, 4, 5};

Append 6 to the end.

vec.push_back(6); //{1,2,3,4,5,6}

Use an index to replace a value.

Here the element at index 2 (whose value is 3) is replaced with 100.

vec[2]=100;//{1,2,100,4,5,6}

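The insert() method inserts a value at a given position:

insert(position (an iterator, similar to a pointer), value to insert)

vec.begin() refers to the very first element, and adding 1 moves it to the second position.

20 is inserted at that second position, and the elements from there on shift back by one.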

vec.insert(vec.begin()+1,20);//{1,20,2,100,4,5,6}

Accessing or assigning through an index outside the vector's size causes undefined behavior.

That means no particular behavior is guaranteed: the result is unpredictable and can differ by compiler and runtime environment.

cout << vec.at(1) << endl;
cout << vec[1] <<endl;

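Both lines print the element at index 1 (the second element).

The difference is that vec.at() performs a bounds check when accessing an element: if the index is invalid, it throws a std::out_of_range exception, so the error can be handled safely.

vec[], in contrast, simply accesses the requested element without checking, so an invalid index can cause undefined behavior.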
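The exception thrown by at() can be caught like any other. A minimal sketch of a checked lookup (try_get is a hypothetical helper name):

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: copies vec[index] into `out` if the index is valid.
// Returns false instead of invoking undefined behavior when it is not.
bool try_get(const std::vector<int>& vec, std::size_t index, int& out) {
    try {
        out = vec.at(index);  // throws std::out_of_range for a bad index
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}
```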

Vector-Deletion

Deletion has a lot in common with vector insertion.

pop_back removes the very last element.

vector<int> vec={10, 20, 30, 40, 50};

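vec.pop_back();// vec={10, 20, 30, 40}

vec.erase(vec.begin()+2);// vec={10, 20, 40}

As with insertion, vec.begin()+2 reaches the third element, which erase then removes.

Moving an iterator with the + operator works here because vector iterators are random-access iterators.

Not every container supports it; std::list iterators, for example, can only be moved one step at a time with ++ or --.

vec.begin() is sometimes described as a pointer, but the correct term is iterator.

Erasing an element in the middle shifts every later element forward by one position.

The capacity stays the same while the size decreases.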
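Passing an iterator at or beyond end() to erase is undefined behavior, so a bounds check before erasing by index is a sensible precaution. A small sketch (erase_at is a hypothetical helper):

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: removes the element at `index` if it exists.
// Later elements shift forward; the capacity is left unchanged.
bool erase_at(std::vector<int>& vec, std::size_t index) {
    if (index >= vec.size()) {
        return false;  // nothing to erase at this position
    }
    vec.erase(vec.begin() + static_cast<std::ptrdiff_t>(index));
    return true;
}
```

Comparing vec.capacity() before and after a call is an easy way to observe that only the size shrinks.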

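To summarize the vector operations covered so far:

push_back(value): appends an element at the end
vec[index]: accesses or replaces an element without a bounds check
at(index): accesses an element with a bounds check
insert(iterator, value): inserts an element before the given position
pop_back(): removes the last element
erase(iterator): removes the element at the given position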

Iteration

Range-Based For Loops

It is essentially the same idea as Python's for ... in ... statement.

for (declaration : range)
statement;

The declaration is usually written with the auto keyword.

auto lets the compiler deduce the element's type automatically.

One caveat: use it where the type is obvious from context, so the code stays readable.

int main() {
    vector<int> vec = {1, 2, 3};
    for (int val : vec) {  // vec is the range; int val is the declaration
        cout << val << endl;
    }
    for (auto val : vec) { // same loop, with the type deduced by auto
        cout << val << endl;
    }
    return 0;
}
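In both loops above, val is a copy of each element. To modify the elements, or to avoid copying large ones, the declaration can be a reference instead. The three function names in this sketch are made up for illustration.

```cpp
#include <vector>

// `auto val` copies each element, so the vector itself is not changed.
void double_copies(std::vector<int>& vec) {
    for (auto val : vec) {
        val *= 2;  // modifies only the local copy
    }
}

// `auto& val` refers to the element itself, so the vector is changed.
void double_in_place(std::vector<int>& vec) {
    for (auto& val : vec) {
        val *= 2;
    }
}

// `const auto&` reads each element without copying or modifying it.
int sum_of(const std::vector<int>& vec) {
    int total = 0;
    for (const auto& val : vec) {
        total += val;
    }
    return total;
}
```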

Iterators

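An iterator is an object that provides sequential access to the elements of a container (std::vector, std::list, std::map, and so on).

It works much like a pointer.

However, not every iterator is a pointer.

Consider vec.begin() and vec.end().

begin returns an iterator to the first element, and end returns an iterator to the position just past the last element.

C++ expresses iteration ranges as the half-open range [begin, end).

Because end points just past the last element, a simple loop condition (it != vec.end()) is enough to visit every element through the last one, and an empty container is simply the case begin == end.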

vector<int> vec={1, 2, 3, 4, 5};

for(auto it=vec.begin(); it !=vec.end(); ++it){
	cout << *it << endl;
    }

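Here too, it is declared with auto so that its type is deduced automatically.

The loop visits every element from the first to the last, and ++it moves to the next element.

As with a pointer, * dereferences it to access the element it refers to.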
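The same begin/end pattern works for containers whose iterators are not random-access. For std::list, begin() + n does not compile, but std::next from <iterator> advances an iterator step by step. A sketch, with nth_value as a hypothetical helper that assumes n is a valid position:

```cpp
#include <cstddef>
#include <iterator>
#include <list>

// Hypothetical helper: returns the element at position `n` of a list.
// Precondition (assumed, not checked): n < values.size().
int nth_value(const std::list<int>& values, std::size_t n) {
    // values.begin() + n would be a compile error for std::list;
    // std::next moves the iterator forward n times instead.
    auto it = std::next(values.begin(), static_cast<std::ptrdiff_t>(n));
    return *it;
}
```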

Finding Elements

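std::find searches a container for an element.

If the target element is present, std::find returns an iterator to it.

If it is not found, std::find returns the end iterator that was passed in, here the iterator just past the last element.

Let's look at an example.

First, include the <algorithm> header so that find is available.

Then initialize vec to finish the setup.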

#include <iostream>
#include <vector>
#include <algorithm> // for std::find

int main(){ 
	std::vector<int> vec={1, 2, 3, 4, 5};

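Next comes the call to find.

It searches from vec.begin() up to vec.end() for the value 3 and returns an iterator to it, which is stored in it; auto deduces the iterator type.

If 3 is not present, it receives the iterator just past the last element, vec.end().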

auto it =std::find(vec.begin(),vec.end(),3);

If the element was found, std::distance gives the distance between vec.begin() and it, the iterator to the value being searched for.

That distance is the index.

if (it != vec.end()) {
    std::cout << "Found 3 at index: " << std::distance(vec.begin(), it) << std::endl;
} else {
    std::cout << "Element 3 not found." << std::endl;
}
    return 0;
}
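The find-then-distance pattern can be wrapped into a reusable function. In this sketch, index_of is a hypothetical helper, and returning -1 for "not found" is an arbitrary convention chosen for the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical helper: returns the index of the first element equal to
// `target`, or -1 if there is none.
std::ptrdiff_t index_of(const std::vector<int>& vec, int target) {
    auto it = std::find(vec.begin(), vec.end(), target);
    if (it == vec.end()) {
        return -1;  // std::find returned the end iterator: not found
    }
    return std::distance(vec.begin(), it);
}
```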

Accumulating Elements

The std::accumulate function computes the sum of a container's elements.

It returns the total.

#include <iostream>
#include <vector>
#include <numeric>

int main(){
	std::vector<int> vec={1, 2, 3, 4, 5};
    
    int sum=std::accumulate(vec.begin(),vec.end(),0);
    
    std::cout << "Sum: " <<sum<<std::endl;
    
    return 0;
    
    }

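To use accumulate, include the <numeric> header.

accumulate takes an initial value and adds each element to it while walking the iterator range. The type of the initial value also determines the type of the result.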
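Because the running total has the type of the initial value, passing 0 for a vector of doubles truncates at every step, while 0.0 keeps the fractional part. accumulate also accepts a fourth argument that replaces addition with another operation. The function names in this sketch are made up for illustration.

```cpp
#include <functional>
#include <numeric>
#include <vector>

// Initial value 0 is an int, so each partial sum is converted back to int.
int truncated_sum(const std::vector<double>& values) {
    return std::accumulate(values.begin(), values.end(), 0);
}

// Initial value 0.0 is a double, so the sum is computed in double.
double full_sum(const std::vector<double>& values) {
    return std::accumulate(values.begin(), values.end(), 0.0);
}

// A binary operation as the fourth argument: here, multiplication.
int product(const std::vector<int>& values) {
    return std::accumulate(values.begin(), values.end(), 1, std::multiplies<int>());
}
```

For the values 0.5, 0.5, 0.5, truncated_sum yields 0 whereas full_sum yields 1.5.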